Preface


The 1990s saw a wave of calculus reform whose aim was to teach students to think 
for themselves and to solve substantial problems, rather than merely memorizing 
formulas and performing rote algebraic manipulations. This book has a similar, 
albeit somewhat more ambitious, goal: to lead you to think mathematically and 
to experience the thrill of independent intellectual discovery. Our chosen subject, 
Number Theory, is particularly well suited for this purpose. The natural numbers 
1, 2, 3, ... satisfy a multitude of beautiful patterns and relationships, many of 
which can be discerned at a glance; others are so subtle that one marvels they were 
noticed at all. Experimentation requires nothing more than paper and pencil, but 
many false alleys beckon to those who make conjectures on too scanty evidence. It 
is only by rigorous demonstration that one is finally convinced that the numerical 
evidence reflects a universal truth. This book will lead you through the groves 
wherein lurk some of the brightest flowers of Number Theory, as it simultaneously 
encourages you to investigate, analyze, conjecture, and ultimately prove your own 
beautiful number theoretic results. 

This book was originally written to serve as a text for Math 42, a course created 
by Jeff Hoffstein at Brown University in the early 1990s. Math 42 was designed to 
attract nonscience majors, those with little interest in pursuing the standard calculus 
sequence, and to convince them to study some college mathematics. The intent was 
to create a course similar to one on, say, “The Music of Mozart” or “Elizabethan 
Drama,” wherein an audience is introduced to the overall themes and methodology 
of an entire discipline through the detailed study of a particular facet of the subject. 
Math 42 has been extremely successful, attracting both its intended audience and 
also scientifically oriented undergraduates interested in a change of pace from their 
large-lecture, cookbook-style courses. 

The prerequisites for reading this book are few. Some facility with high school 
algebra is required, and those who know how to program a computer will have fun 
generating reams of data and implementing assorted algorithms, but in truth the 
reader needs nothing more than a simple calculator. Concepts from calculus are 
mentioned in passing, but are not used in an essential way. However, and the reader 
is hereby forewarned, it is not possible to truly appreciate Number Theory without
an eager and questioning mind and a spirit that is not afraid to experiment, to make 
mistakes and profit from them, to accept frustration and persevere to the ultimate 
triumph. Readers who are able to cultivate these qualities will find themselves 
richly rewarded, both in their study of Number Theory and their appreciation of all 
that life has to offer. 

Changes in the Fourth Edition


There are a number of major changes in the fourth edition. 


• There is a new chapter on mathematical induction (Chapter 26).


• Some material on proof by contradiction has been moved forward to Chapter 8.
  It is used in the proof that a polynomial of degree d has at most d
  roots modulo p. This fact is then used in place of primitive roots as a tool to
  prove Euler’s quadratic residue formula in Chapter 21. (In earlier editions,
  primitive roots were used for this proof.)


• The chapters on primitive roots (Chapters 28-29) have been moved to follow
  the chapters on quadratic reciprocity and sums of squares (Chapters 20-25).
  The rationale for this change is the author’s experience that students find the
  Primitive Root Theorem to be among the most difficult in the book. The new
  order allows the instructor to cover quadratic reciprocity first, and to omit
  primitive roots entirely if desired.


• Chapter 22 now includes a proof of part of quadratic reciprocity for Jacobi
  symbols, with the remaining parts included as exercises.


• Quadratic reciprocity is now proved in full. The proofs for $\left(\frac{-1}{p}\right)$ and $\left(\frac{2}{p}\right)$
  remain as before in Chapter 21, and there is a new chapter (Chapter 23) that
  gives Eisenstein’s proof for $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right)$. Chapter 23 is significantly more difficult
  than the chapters that precede it, and it may be omitted without affecting the
  subsequent chapters.


• As an application of primitive roots, Chapter 28 discusses the construction
  of Costas arrays.

• Chapter 39 includes a proof that the period of the Fibonacci sequence modulo p
  divides p - 1 when p is congruent to 1 or 4 modulo 5.

• There are many new exercises scattered throughout the text.

• A flowchart giving chapter dependencies is included on page ix.

• Number theory is a vast and sprawling subject, and over the years this book
  has acquired many new chapters. In order to keep the length of this edition
  to a reasonable size, Chapters 47-50 have been removed from the printed
  version of the book. These omitted chapters are freely available online at

      http://www.math.brown.edu/~jhs/frint.html
      http://www.pearsonhighered.com/mathstatsresources

  The online chapters are included in the index.

All the people listed above have helped me to correct numerous mistakes and to
greatly refine the exposition, but no book is ever free from error or incapable of 
being improved. I would be delighted to receive comments, good or bad, and 
corrections from my readers. You can send mail to me at 


jhs@math.brown.edu 


Additional material, including extra chapters, an errata sheet, links to interesting 
number theoretic sites, and downloadable versions of various computer exercises, 

Introduction


Euclid alone
Has looked on Beauty bare. Fortunate they 
Who, though once only and then but far away, 
Have heard her massive sandal set on stone. 


Edna St. Vincent Millay (1923) 


The origins of the natural numbers 1, 2, 3, 4, 5, 6, ... are lost in the mists of 
time. We have no knowledge of who first realized that there is a certain concept of 
“threeness” that applies equally well to three rocks, three stars, and three people.
From the very beginnings of recorded history, numbers have inspired an endless 
fascination—mystical, aesthetic, and practical as well. It is not just the numbers 
themselves, of course, that command attention. Far more intriguing are the rela- 
tionships that numbers exhibit, one with another. It is within these profound and 
often subtle relationships that one finds the Beauty¹ so strikingly described in Edna
St. Vincent Millay’s poem. Here is another description by a celebrated twentieth- 
century philosopher. 


Mathematics, rightly viewed, possesses not only truth, but supreme 
beauty—a beauty cold and austere, like that of sculpture, without ap- 
peal to any part of our weaker nature, without the gorgeous trappings 
of paintings or music, yet sublimely pure, and capable of a stern per- 
fection such as only the greatest art can show. (Bertrand Russell, 1902) 


The Theory of Numbers is that area of mathematics whose aim is to uncover 
the many deep and subtle relationships among different sorts of numbers. To take 
a simple example, many people through the ages have been intrigued by the square 
numbers 1, 4, 9, 16, 25, .... If we perform the experiment of adding together pairs
of square numbers, we will find that occasionally we get another square. The most
famous example of this phenomenon is

$$3^2 + 4^2 = 5^2,$$

but there are many others, such as

$$5^2 + 12^2 = 13^2, \qquad 20^2 + 21^2 = 29^2, \qquad 28^2 + 45^2 = 53^2.$$

¹Euclid, indeed, has looked on Beauty bare, and not merely the beauty of geometry that most
people associate with his name. Number theory is prominently featured in Books VII, VIII, and IX
of Euclid’s famous Elements.


Triples like (3, 4,5), (5, 12, 13), (20, 21, 29), and (28, 45, 53) have been given the 
name Pythagorean triples. Based on this experiment, anyone with a lively curiosity 
is bound to pose various questions, such as “Are there infinitely many Pythagorean 
triples?” and “If so, can we find a formula that describes all of them?” These are 
the sorts of questions dealt with by number theory. 

As another example, consider the problem of finding the remainder when the
huge number

$$32478543^{743921429837645}$$

is divided by 54817263. Here’s one way to solve this problem. Take the number
32478543, multiply it by itself 743921429837645 times, use long division to di- 
vide by 54817263, and take the remainder. In principle, this method will work, 
but in practice it would take far longer than a lifetime, even on the world’s fastest 
computers. Number theory provides a means for solving this problem, too. “Wait a 
minute,” I hear you say, “Pythagorean triples have a certain elegance that is pleas- 
ing to the eye, but where is the beauty in long division and remainders?” The 
answer is not in the remainders themselves, but in the use to which such remain- 
ders can be put. In a striking turn of events, mathematicians have shown how the 
solution of this elementary remainder problem (and its inverse) leads to the creation
of simple codes that are so secure that even the National Security Agency²
is unable to break them. So much for G.H. Hardy’s singularly unprophetic remark
that “no one has yet discovered any warlike purpose to be served by the theory of
numbers or relativity, and it seems very unlikely that anyone will do so for many
years.”³
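To make the remainder problem concrete, here is a minimal Python sketch. It assumes the standard technique of repeated squaring, which this introduction does not name, so it should be read as one possible approach rather than as the method the book develops later. The function name power_mod is our own choice for illustration.

```python
def power_mod(base, exponent, modulus):
    """Return the remainder of base**exponent divided by modulus.

    Uses repeated squaring, so the number of multiplications grows with
    the number of binary digits of the exponent, not with the exponent itself.
    """
    if modulus <= 0:
        raise ValueError("modulus must be positive")
    if exponent < 0:
        raise ValueError("exponent must be nonnegative")
    result = 1 % modulus          # handles the special case modulus == 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:          # current binary digit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus   # square for the next binary digit
        exponent >>= 1
    return result


if __name__ == "__main__":
    # The numbers from the text; the answer appears almost instantly.
    print(power_mod(32478543, 743921429837645, 54817263))
```

Every intermediate value is reduced modulo 54817263, so no number larger than the square of the modulus is ever stored. Python's built-in three-argument pow(base, exponent, modulus) computes the same quantity and can be used to check the result.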

The land of Number Theory is populated by a variety of exotic flora and fauna. 
There are square numbers and prime numbers and odd numbers and perfect num- 
bers (but no square-prime numbers and, as far as anyone knows, no odd-perfect 
numbers). There are Fermat equations and Pell equations, Pythagorean triples and
elliptic curves, Fibonacci’s rabbits, unbreakable codes, and much, much more. You
will meet all these creatures, and many others, as we journey through the Theory
of Numbers.

²The National Security Agency (NSA) is the arm of the United States government charged with
data collection, code making, and code breaking. The NSA, with a budget larger than that of the
CIA, is supposedly the single largest employer of mathematicians in the world.

³A Mathematician’s Apology, §28, G.H. Hardy, Camb. Univ. Press, 1940.


Guide for the Instructor 


This book is designed to be used as a text for a one-semester or full-year course 
in undergraduate number theory or for an independent study or reading course. 
It contains approximately two semesters’ worth of material, so the instructor of a 
one-semester course will have some flexibility in the choice of topics. The first 11 
chapters are basic, and probably most instructors will want to continue through 
the RSA cryptosystem in Chapter 18, since in my experience this is one of the 
students’ favorite topics. 

There are now many ways to proceed. Here are a few possibilities that seem to 
fit comfortably into one semester, but feel free to slice-and-dice the later chapters 
to fit your own tastes. 


Chapters 20-26, 31-34, and 47-48. Quadratic Reciprocity, sums of squares, in- 
duction, Pell’s equation, Diophantine approximation, and continued frac- 
tions. 


Chapters 30-34 and 41-46. Fermat’s equation for exponent 4, Pell’s equation, 
Diophantine approximation, elliptic curves, and Fermat’s Last Theorem. 


Chapters 26, 31-39 and 47-48. Induction, Pell’s equation, Diophantine approx- 
imation, Gaussian integers, transcendental numbers, binomial coefficients, 
linear recurrences, and continued fractions. 


Chapters 19-22, 26-29, and 38-40. Primality testing, quadratic reciprocity, in- 
duction, primitive roots, binomial coefficients, linear recurrences, big-Oh 
notation. (This syllabus is designed in particular for students planning fur- 
ther work in computer science or cryptography.) 


In any case, a good final project is to have the students read a few of the omitted 
chapters and do the exercises. 

Most of the nonnumerical nonprogramming exercises in this book are designed 
to foster discussion and experimentation. They do not necessarily have “correct” 
or “complete” answers. Many students will find this extremely disconcerting at 
first, so it must be stressed repeatedly. You can make your students feel more at 
ease by prefacing such questions with the phrase “Tell me as much as you can 
about ....” Tell your students that accumulating data and solving special cases are 
not merely acceptable, but encouraged. On the other hand, tell them that there is 
no such thing as a complete solution, since the solution of a good problem always
raises additional questions. So if they can fully answer the specific question given 
in the text, their next task is to look for generalizations and for limitations on the 
validity of their solution. 

Aside from a few clearly marked exercises, calculus is required only in two late 
chapters (Big-Oh notation in Chapter 40 and Generating Functions in Chapter 49). 
If the class has not taken calculus, these chapters may be omitted with no harm to 
the flow of the material. 

Number theory is not easy, so there’s no point in trying to convince the stu- 
dents that it is. Instead, this book will show your students that they are capable of 
mastering a difficult subject and experiencing the intense satisfaction of intellectual 
discovery. Your reward as the instructor is to bask in the glow of their endeavors. 


Computers, Number Theory, and This Book 


At this point I would like to say a few words about the use of computers in con- 
junction with this book. I neither expect nor desire that the reader make use of a 
high-level computer package such as Maple, Mathematica, PARI, or Derive, and 
most exercises (except as otherwise indicated) can be done with a simple pocket 
calculator. To take a concrete example, studying greatest common divisors (Chapter 5)
by typing GCD[M, N] into a computer is akin to studying electronics by turning
on a television set. Admittedly, computers allow one to do examples with large
numbers, and you will find such computer-generated examples scattered through 
the text, but our ultimate goal is always to understand concepts and relationships. 
So if I were forced to make a firm ruling, yea or nay, regarding computers, I would 
undoubtedly forbid their use. 

However, just as with any good rule, certain exceptions will be admitted. First, 
one of the best ways to understand a subject is to explain it to someone else; so if 
you know a little bit of how to write computer programs, you will find it extremely 
enlightening to explain to a computer how to perform the algorithms described 
in this book. In other words, don’t rely on a canned computer package; do the 
programming yourself. Good candidates for such treatment are the Euclidean algorithm
(Chapters 5-6), the RSA cryptosystem (Chapters 16-18), primality testing
(Chapter 19), Quadratic Reciprocity (Chapter 22), writing numbers as sums of two
squares (Chapters 24-25), continued fractions and solving Pell’s equation (Chapters
47-48), and generating rational points on elliptic curves (Chapter 41).

The second exception to the “no computer rule” is generation of data. Dis- 
covery in number theory is usually based on experimentation, which may involve 
examining reams of data to try to distinguish underlying patterns. Computers are 
well suited to generating such data and also sometimes to assist in searching for 
patterns, and I have no objection to their being used for these purposes.

I have included a number of computer exercises and computer projects to en- 
courage you to use computers properly as tools to help understand and investigate 
the theory of numbers. Some of these exercises can be implemented on a small 
computer (or even a programmable calculator), while others require more sophis- 
ticated machines and/or programming languages. Exercises and projects requiring 
a computer are marked by a special symbol (a computer icon in the printed text).

For many of the projects I have not given a precise formulation, since part of 
the project is to decide exactly what the user should input and exactly what form 
the output should take. Note that a good computer program must include all the 
following features: 


• Clearly written documentation explaining what the program does, how to use
  it, what quantities it takes as input, and what quantities it returns as output.

• Extensive internal comments explaining how the program works.

• Complete error handling with informative error messages. For example, if
  a = b = 0, then the gcd(a,b) routine should return the error message
  “gcd(0,0) is undefined” instead of going into an infinite loop or
  returning a “division by zero” error.


As you write your own programs, try to make them user friendly and as versatile 
as possible, since ultimately you will want to link the pieces together to form your 
own package of number theoretic routines. 
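As an illustration of these three features, here is a minimal Python sketch of a gcd routine. It assumes the usual Euclidean algorithm of repeated division with remainder (the subject of Chapter 5, which is not reproduced here), and it reports the gcd(0,0) case by raising an error, which is one reasonable reading of "return the error message." Treat it as a starting point for your own routine rather than a finished product.

```python
def gcd(a, b):
    """Return the greatest common divisor of the integers a and b.

    Input:  two integers a and b, not both zero (negative values are allowed).
    Output: the largest positive integer that divides both a and b.
    Errors: TypeError if an input is not an integer;
            ValueError with the message "gcd(0,0) is undefined" if a = b = 0.
    """
    if not isinstance(a, int) or not isinstance(b, int):
        raise TypeError("gcd requires two integers")
    if a == 0 and b == 0:
        raise ValueError("gcd(0,0) is undefined")

    # The gcd does not depend on signs, so work with absolute values.
    a, b = abs(a), abs(b)

    # Euclidean algorithm: replace (a, b) by (b, remainder of a divided by b)
    # until the remainder is zero. The last nonzero value is the gcd.
    while b != 0:
        a, b = b, a % b
    return a


if __name__ == "__main__":
    print(gcd(12, 18))
    try:
        gcd(0, 0)
    except ValueError as error:
        print("error:", error)
```

Checking for a = b = 0 before the loop means the routine never reaches a division by zero, and the caller sees a message that names the actual problem.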

The moral is that computers are useful as a tool for experimentation and that 
you can learn a lot by teaching a computer how to perform number theoretic calcu- 
lations, but when you are first learning a subject, a prepackaged computer program 
merely provides a crutch that prevents you from learning to walk on your own. 


Chapter 1 


What Is Number Theory? 


Number theory is the study of the set of positive whole numbers 
    1, 2, 3, 4, 5, 6, 7, ...,


which are often called the set of natural numbers. We will especially want to study 
the relationships between different sorts of numbers. Since ancient times, people 
have separated the natural numbers into a variety of different types. Here are some 
familiar and not-so-familiar examples: 


    odd             1, 3, 5, 7, 9, 11, ...
    even            2, 4, 6, 8, 10, ...
    square          1, 4, 9, 16, 25, 36, ...
    cube            1, 8, 27, 64, 125, ...
    prime           2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, ...
    composite       4, 6, 8, 9, 10, 12, 14, 15, 16, ...
    1 (modulo 4)    1, 5, 9, 13, 17, 21, 25, ...
    3 (modulo 4)    3, 7, 11, 15, 19, 23, 27, ...
    triangular      1, 3, 6, 10, 15, 21, ...
    perfect         6, 28, 496, ...
    Fibonacci       1, 1, 2, 3, 5, 8, 13, 21, ...

Many of these types of numbers are undoubtedly already known to you. Others,
such as the “modulo 4” numbers, may not be familiar. A number is said to be
congruent to 1 (modulo 4) if it leaves a remainder of 1 when divided by 4, and sim- 
ilarly for the 3 (modulo 4) numbers. A number is called triangular if that number 
of pebbles can be arranged in a triangle, with one pebble at the top, two pebbles 
in the next row, and so on. The Fibonacci numbers are created by starting with 1 
and 1. Then, to get the next number in the list, just add the previous two. Finally, a 
number is perfect if the sum of all its divisors, other than itself, adds back up to the 
original number. Thus, the numbers dividing 6 are 1, 2, and 3, and 1 + 2 + 3 = 6.
Similarly, the divisors of 28 are 1, 2, 4, 7, and 14, and

$$1 + 2 + 4 + 7 + 14 = 28.$$


We will encounter all these types of numbers, and many others, in our excursion 
through the Theory of Numbers. 
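The definitions above translate almost word for word into short programs. The following Python sketch generates a few of the lists directly from the definitions. The function names are our own and the code favors clarity over speed, so it is a sketch for small experiments, not an efficient implementation.

```python
def numbers_with_remainder(remainder, modulus, limit):
    """Numbers from 1 to limit that leave the given remainder when divided by modulus."""
    return [n for n in range(1, limit + 1) if n % modulus == remainder]


def triangular_numbers(count):
    """First `count` triangular numbers, built row by row: 1, 1+2, 1+2+3, ..."""
    numbers, total = [], 0
    for row in range(1, count + 1):
        total += row              # add one more row of pebbles
        numbers.append(total)
    return numbers


def fibonacci_numbers(count):
    """First `count` Fibonacci numbers: start with 1 and 1, then add the previous two."""
    numbers = [1, 1]
    while len(numbers) < count:
        numbers.append(numbers[-1] + numbers[-2])
    return numbers[:count]


def is_perfect(n):
    """True if n equals the sum of all its divisors other than itself."""
    return sum(d for d in range(1, n) if n % d == 0) == n


if __name__ == "__main__":
    print(numbers_with_remainder(3, 4, 27))   # 3 (modulo 4) numbers
    print(triangular_numbers(6))
    print(fibonacci_numbers(8))
    print([n for n in range(1, 1000) if is_perfect(n)])
```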


Some Typical Number Theoretic Questions 


The main goal of number theory is to discover interesting and unexpected rela- 
tionships between different sorts of numbers and to prove that these relationships 
are true. In this section we will describe a few typical number theoretic problems, 
some of which we will eventually solve, some of which have known solutions too 
difficult for us to include, and some of which remain unsolved to this day. 


Sums of Squares I. Can the sum of two squares be a square? The answer is 
clearly “YES”; for example $3^2 + 4^2 = 5^2$ and $5^2 + 12^2 = 13^2$. These are
examples of Pythagorean triples. We will describe all Pythagorean triples in 
Chapter 2. 


Sums of Higher Powers. Can the sum of two cubes be a cube? Can the sum 
of two fourth powers be a fourth power? In general, can the sum of two 
$n$th powers be an $n$th power? The answer is “NO.” This famous problem,
called Fermat’s Last Theorem, was first posed by Pierre de Fermat in the 
seventeenth century, but was not completely solved until 1994 by Andrew 
Wiles. Wiles’s proof uses sophisticated mathematical techniques that we 
will not be able to describe in detail, but in Chapter 30 we will prove that 
no fourth power is a sum of two fourth powers, and in Chapter 46 we will 
sketch some of the ideas that go into Wiles’s proof. 


Infinitude of Primes. A prime number is a number p whose only factors are 1 
and p. 
• Are there infinitely many prime numbers?
• Are there infinitely many primes that are 1 modulo 4 numbers?
• Are there infinitely many primes that are 3 modulo 4 numbers?
The answer to all these questions is “YES.” We will prove these facts in
Chapters 12 and 21 and also discuss a much more general result proved by
Lejeune Dirichlet in 1837.

Sums of Squares II. Which numbers are sums of two squares? It often turns out
that questions of this sort are easier to answer first for primes, so we ask 
which (odd) prime numbers are a sum of two squares. For example, 


    3 = NO,             $5 = 1^2 + 2^2$,    7 = NO,             11 = NO,
    $13 = 2^2 + 3^2$,   $17 = 1^2 + 4^2$,   19 = NO,            23 = NO,
    $29 = 2^2 + 5^2$,   31 = NO,            $37 = 1^2 + 6^2$,   ...


Do you see a pattern? Possibly not, since this is only a short list, but a longer 
list leads to the conjecture that p is a sum of two squares if it is congruent 
to 1 (modulo 4). In other words, p is a sum of two squares if it leaves a 
remainder of 1 when divided by 4, and it is not a sum of two squares if it 
leaves a remainder of 3. We will prove that this is true in Chapter 24. 
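The "longer list" mentioned above is easy to generate by machine. Here is a minimal Python sketch that tests each odd prime below a bound by brute force and prints its remainder modulo 4 next to the result. The function names and the bound of 100 are our own choices, and the search is deliberately naive. It gathers evidence for the conjecture but proves nothing.

```python
from math import isqrt


def is_prime(n):
    """Trial division: True if n is a prime number."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def two_squares(p):
    """Return (a, b) with 1 <= a <= b and a*a + b*b == p, or None if there is no such pair."""
    for a in range(1, isqrt(p) + 1):
        remainder = p - a * a
        if remainder < a * a:     # b would be smaller than a; stop searching
            break
        b = isqrt(remainder)
        if b * b == remainder:
            return (a, b)
    return None


if __name__ == "__main__":
    for p in range(3, 100, 2):
        if is_prime(p):
            pair = two_squares(p)
            answer = "NO" if pair is None else f"{pair[0]}^2 + {pair[1]}^2"
            print(f"{p:3d}  (remainder {p % 4} modulo 4)  {answer}")
```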


Number Shapes. The square numbers are the numbers 1, 4, 9, 16, ... that can 
be arranged in the shape of a square. The triangular numbers are the num- 
bers 1, 3, 6, 10, ... that can be arranged in the shape of a triangle. The first 
few triangular and square numbers are illustrated in Figure 1.1. 


[Figure 1.1: Numbers That Form Interesting Shapes. Triangular numbers: 1 + 2 = 3, 1 + 2 + 3 = 6, 1 + 2 + 3 + 4 = 10. Square numbers: $2^2 = 4$, $3^2 = 9$, $4^2 = 16$.]


A natural question to ask is whether there are any triangular numbers that 
are also square numbers (other than 1). The answer is “YES,” the smallest 
example being 


$$36 = 6^2 = 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8.$$


So we might ask whether there are more examples and, if so, are there infinitely
many? To search for examples, the following formula is helpful:

$$1 + 2 + 3 + \cdots + (n-1) + n = \frac{n(n+1)}{2}.$$


There is an amusing anecdote associated with this formula. One day 
when the young Carl Friedrich Gauss (1777-1855) was in grade school, 
his teacher became so incensed with the class that he set them the task 
of adding up all the numbers from 1 to 100. As Gauss’s classmates 
dutifully began to add, Gauss walked up to the teacher and presented the 
answer, 5050. The story goes that the teacher was neither impressed nor 
amused, but there’s no record of what the next make-work assignment 
was! 


There is an easy geometric way to verify Gauss’s formula, which may be the 
way he discovered it himself. The idea is to take two triangles consisting of 
$1 + 2 + \cdots + n$ pebbles and fit them together with one additional diagonal
of n + 1 pebbles. Figure 1.2 illustrates this idea for n = 6. 


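[Figure 1.2: The Sum of the First n Integers. Two triangles of pebbles with rows of 1 through 6 pebbles are fitted together with a diagonal of 7 pebbles: $(1 + 2 + 3 + 4 + 5 + 6) + 7 + (6 + 5 + 4 + 3 + 2 + 1) = 7^2$.]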


In the figure, we have marked the extra n + 1 = 7 pebbles on the diagonal 
with black dots. The resulting square has sides consisting of n + 1 pebbles, 
so in mathematical terms we obtain the formula 
$$2(1 + 2 + 3 + \cdots + n) + (n + 1) = (n + 1)^2,$$
two triangles + diagonal = square.
Now we can subtract n + 1 from each side and divide by 2 to get Gauss’s
formula.
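Written out step by step, the subtraction and division look like this. The only step not spelled out in the text is the factorization $(n+1)^2 - (n+1) = n(n+1)$.

```latex
\begin{align*}
2(1 + 2 + 3 + \cdots + n) + (n + 1) &= (n + 1)^2 \\
2(1 + 2 + 3 + \cdots + n) &= (n + 1)^2 - (n + 1) = n(n + 1) \\
1 + 2 + 3 + \cdots + n &= \frac{n(n + 1)}{2}
\end{align*}
```

For n = 100 this gives 100 · 101 / 2 = 5050, the answer in the anecdote about Gauss.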


Twin Primes. In the list of primes it is sometimes true that consecutive odd numbers
are both prime. We have boxed these twin primes, using square brackets, in the
following list of primes less than 100:

    [3], [5], [7], [11], [13], [17], [19], 23, [29], [31], 37,
    [41], [43], 47, 53, [59], [61], 67, [71], [73], 79, 83, 89, 97.

Are there infinitely many twin primes? That is, are there infinitely many
prime numbers p such that p + 2 is also a prime? At present, no one knows
the answer to this question.
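Collecting twin primes by hand becomes tedious quickly, so this is a natural place to let a computer generate data. The Python sketch below lists every pair (p, p + 2) of primes below a bound. It uses the sieve of Eratosthenes, a standard technique that this chapter does not describe, and the function names are our own.

```python
from math import isqrt


def primes_below(limit):
    """Return the list of primes less than limit, using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_candidate = [True] * limit
    is_candidate[0] = is_candidate[1] = False
    for i in range(2, isqrt(limit) + 1):
        if is_candidate[i]:
            # Cross out every multiple of i, starting from i*i.
            for multiple in range(i * i, limit, i):
                is_candidate[multiple] = False
    return [n for n, flag in enumerate(is_candidate) if flag]


def twin_prime_pairs(limit):
    """Return all pairs (p, p + 2) in which both numbers are primes less than limit."""
    primes = primes_below(limit)
    prime_set = set(primes)
    return [(p, p + 2) for p in primes if p + 2 in prime_set]


if __name__ == "__main__":
    print(twin_prime_pairs(100))
    # Counting pairs for growing bounds gives data, not a proof.
    for bound in (100, 1000, 10000):
        print(bound, len(twin_prime_pairs(bound)))
```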


Primes of the Form $N^2 + 1$. If we list the numbers of the form $N^2 + 1$ taking
N = 1, 2, 3, ..., we find that some of them are prime. Of course, if N is
odd, then $N^2 + 1$ is even, so it won’t be prime unless N = 1. So it’s really
only interesting to take even values of N. In the following list, the composite
values are shown with their factorizations, and the remaining values are prime:

    $2^2 + 1 = 5$       $4^2 + 1 = 17$                  $6^2 + 1 = 37$      $8^2 + 1 = 65 = 5 \cdot 13$
    $10^2 + 1 = 101$    $12^2 + 1 = 145 = 5 \cdot 29$   $14^2 + 1 = 197$
    $16^2 + 1 = 257$    $18^2 + 1 = 325 = 5^2 \cdot 13$ $20^2 + 1 = 401$.


It looks like there are quite a few prime values, but if you take larger values 
of N you will find that they become much rarer. So we ask whether there are 
infinitely many primes of the form $N^2 + 1$. Again, no one presently knows
the answer to this question. 
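To see how the prime values thin out, one can extend the list by machine. The following Python sketch, with function names of our own choosing and a simple trial-division primality test, finds the even N up to a bound for which $N^2 + 1$ is prime. It is meant only for gathering data on this open question.

```python
from math import isqrt


def is_prime(n):
    """Trial division: True if n is a prime number."""
    if n < 2:
        return False
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def even_n_with_prime_value(limit):
    """Return the even N with 2 <= N <= limit for which N*N + 1 is prime."""
    return [n for n in range(2, limit + 1, 2) if is_prime(n * n + 1)]


if __name__ == "__main__":
    print(even_n_with_prime_value(20))
    # Fraction of even N up to each bound that give a prime value.
    for bound in (20, 200, 2000):
        hits = len(even_n_with_prime_value(bound))
        print(bound, hits, hits / (bound // 2))
```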


We have now seen some of the types of questions that are studied in the Theory 
of Numbers. How does one attempt to answer these questions? The answer is that 
Number Theory is partly experimental and partly theoretical. The experimental 
part normally comes first; it leads to questions and suggests ways to answer them. 
The theoretical part follows; in this part one tries to devise an argument that gives 
a conclusive answer to the questions. In summary, here are the steps to follow: 


1. Accumulate data, usually numerical, but sometimes more abstract in nature.
2. Examine the data and try to find patterns and relationships.
3. Formulate conjectures (i.e., guesses) that explain the patterns and relationships.
   These are frequently given by formulas.
4. Test your conjectures by collecting additional data and checking whether the
   new information fits your conjectures.
5. Devise an argument (i.e., a proof) that your conjectures are correct.


All five steps are important in number theory and in mathematics. More gener- 
ally, the scientific method always involves at least the first four steps. Be wary of 
any purported “scientist” who claims to have “proved” something using only the 
first three. Given any collection of data, it’s generally not too difficult to devise 
numerous explanations. The true test of a scientific theory is its ability to predict 
the outcome of experiments that have not yet taken place. In other words, a scien- 
tific theory only becomes plausible when it has been tested against new data. This 
is true of all real science. In mathematics one requires the further step of a proof, 
that is, a logical sequence of assertions, starting from known facts and ending at 
the desired statement. 


Exercises 


1.1. The first two numbers that are both squares and triangles are 1 and 36. Find the 
next one and, if possible, the one after that. Can you figure out an efficient way to find 
triangular-square numbers? Do you think that there are infinitely many? 


1.2. Try adding up the first few odd numbers and see if the numbers you get satisfy some 
sort of pattern. Once you find the pattern, express it as a formula. Give a geometric 
verification that your formula is correct. 


1.3. The consecutive odd numbers 3, 5, and 7 are all primes. Are there infinitely many 
such “prime triplets”? That is, are there infinitely many prime numbers p such that p + 2 
and p + 4 are also primes? 


1.4. It is generally believed that infinitely many primes have the form $N^2 + 1$, although
no one knows for sure.

(a) Do you think that there are infinitely many primes of the form $N^2 - 1$?

(b) Do you think that there are infinitely many primes of the form $N^2 - 2$?

(c) How about of the form $N^2 - 3$? How about $N^2 - 4$?

(d) Which values of a do you think give infinitely many primes of the form $N^2 - a$?


1.5. The following two lines indicate another way to derive the formula for the sum of the 
first n integers by rearranging the terms in the sum. Fill in the details. 
$$1 + 2 + 3 + \cdots + n = (1 + n) + (2 + (n - 1)) + (3 + (n - 2)) + \cdots$$
$$= (1 + n) + (1 + n) + (1 + n) + \cdots.$$
How many copies of n + 1 are there in the second line? You may need to consider the
cases of odd n and even n separately. If that’s not clear, first try writing it out explicitly for
n = 6 and n = 7.

1.6. For each of the following statements, fill in the blank with an easy-to-check criterion:
(a) M is a triangular number if and only if ______ is an odd square.
(b) N is an odd square if and only if ______ is a triangular number.
(c) Prove that your criteria in (a) and (b) are correct.


Chapter 2 


Pythagorean Triples 


The Pythagorean Theorem, that “beloved” formula of all high school geometry 
students, says that the sum of the squares of the sides of a right triangle equals the 
square of the hypotenuse. In symbols, 


$$a^2 + b^2 = c^2.$$

[Figure 2.1: A Pythagorean Triangle, a right triangle with sides a and b and hypotenuse c]


Since we’re interested in number theory, that is, the theory of the natural num- 
bers, we will ask whether there are any Pythagorean triangles all of whose sides are 
natural numbers. There are many such triangles. The most famous has sides 3, 4, 
and 5. Here are the first few examples: 


$$3^2 + 4^2 = 5^2, \qquad 5^2 + 12^2 = 13^2, \qquad 8^2 + 15^2 = 17^2, \qquad 28^2 + 45^2 = 53^2.$$


The study of these Pythagorean triples began long before the time of Pythago- 
ras. There are Babylonian tablets that contain lists of parts of such triples, including 
quite large ones, indicating that the Babylonians probably had a systematic method 
for producing them. Even more amazing is the fact that the Babylonians may have 
used their lists of Pythagorean triples as primitive trigonometric tables. Pythagorean
triples were also used in ancient Egypt. For example, a rough-and-ready way
to produce a right angle is to take a piece of string, mark it into 12 equal segments, 
tie it into a loop, and hold it taut in the form of a 3-4-5 triangle, as illustrated in Fig- 
ure 2.2. This provides an inexpensive right angle tool for use on small construction 
projects (such as marking property boundaries or building pyramids). 


[Figure 2.2: Using a knotted string to create a right triangle. Panels: string with 12 knots; string pulled taut.]


The Babylonians and Egyptians had practical reasons for studying Pythagor- 
ean triples. Do such practical reasons still exist? For this particular problem, the 
answer is “probably not.” However, there is at least one good reason to study 
Pythagorean triples, and it’s the same reason why it is worthwhile studying the art 
of Rembrandt and the music of Beethoven. There is a beauty to the ways in which 
numbers interact with one another, just as there is a beauty in the composition of a 
painting or a symphony. To appreciate this beauty, one has to be willing to expend 
a certain amount of mental energy. But the end result is well worth the effort. Our 
goal in this book is to understand and appreciate some truly beautiful mathematics, 
to learn how this mathematics was discovered and proved, and maybe even to make 
some original contributions of our own. 

Enough blathering, you are undoubtedly thinking. Let’s get to the real stuff. 
Our first naive question is whether there are infinitely many Pythagorean triples, 
that is, triples of natural numbers (a, b, c) satisfying the equation $a^2 + b^2 = c^2$. The
answer is “YES” for a very silly reason. If we take a Pythagorean triple (a, b, c) 
and multiply it by some other number d, then we obtain a new Pythagorean triple 
(da, db, dc). This is true because 


(da)* + (db)? = d?(a? + b*) = d’c* = (dc)’. 


Clearly these new Pythagorean triples are not very interesting. So we will concen- 
trate our attention on triples with no common factors. We will even give them a 
name:

A primitive Pythagorean triple (or PPT for short) is a triple of numbers
(a, b, c) such that a, b, and c have no common factors¹ and satisfy

$$a^2 + b^2 = c^2.$$


Recall our checklist from Chapter 1. The first step is to accumulate some data. 
I used a computer to substitute in values for a and b and checked if $a^2 + b^2$ is a
square. Here are some primitive Pythagorean triples that I found:


(3, 4, 5), (5, 12, 13), (8, 15, 17), (7, 24, 25),
(20, 21, 29), (9, 40, 41), (12, 35, 37), (11, 60, 61),
(28, 45, 53), (33, 56, 65), (16, 63, 65).
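
The program used to produce this list is not shown in the text. As a minimal sketch of the kind of search described, the following Python code tries every pair a < b up to a bound and keeps the pairs for which $a^2 + b^2$ is a perfect square and a, b, c share no common factor. The function name `primitive_triples` and the bound 65 are illustrative assumptions, not part of the original.

```python
from math import gcd, isqrt

def primitive_triples(limit):
    """Return primitive Pythagorean triples (a, b, c) with a < b <= limit."""
    found = []
    for a in range(1, limit + 1):
        for b in range(a + 1, limit + 1):
            c = isqrt(a * a + b * b)
            # Keep (a, b, c) only if a^2 + b^2 is a perfect square and
            # a, b, c have no common factor larger than 1.
            if c * c == a * a + b * b and gcd(gcd(a, b), c) == 1:
                found.append((a, b, c))
    return found

triples = primitive_triples(65)  # hypothetical search bound
for triple in triples:
    print(triple)
```

The search lists each triple with the smaller leg first, so a triple such as (8, 15, 17) appears with its even entry in the first position.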


A few conclusions can easily be drawn even from such a short list. For example, it
certainly looks like one of a and b is odd and the other even. It also seems that c is
always odd.

It’s not hard to prove that these conjectures are correct. First, if a and b are both
even, then c would also be even. This means that a, b, and c would have a common
factor of 2, so the triple would not be primitive. Next, suppose that a and b are
both odd, which means that c would have to be even. This means that there are
numbers x, y, and z such that

$$a = 2x + 1, \quad b = 2y + 1, \quad\text{and}\quad c = 2z.$$

We can substitute these into the equation $a^2 + b^2 = c^2$ to get

$$(2x + 1)^2 + (2y + 1)^2 = (2z)^2,$$
$$4x^2 + 4x + 4y^2 + 4y + 2 = 4z^2.$$

Now divide by 2,

$$2x^2 + 2x + 2y^2 + 2y + 1 = 2z^2.$$

This last equation says that an odd number is equal to an even number, which is
impossible, so a and b cannot both be odd. Since we’ve just checked that they
cannot both be even and cannot both be odd, it must be true that one is even and


the other is odd. It’s then obvious from the equation $a^2 + b^2 = c^2$ that c is also
odd.

¹A common factor of a, b, and c is a number d such that each of a, b, and c is a multiple of d. For
example, 3 is a common factor of 30, 42, and 105, since 30 = 3 · 10, 42 = 3 · 14, and 105 = 3 · 35,
and indeed it is their largest common factor. On the other hand, the numbers 10, 12, and 15 have
no common factor (other than 1). Since our goal in this chapter is to explore some interesting and
beautiful number theory without getting bogged down in formalities, we will use common factors
and divisibility informally and trust our intuition. In Chapter 5 we will return to these questions and
develop the theory of divisibility more carefully.

We can always switch a and b, so our problem now is to find all solutions in
natural numbers to the equation

$$a^2 + b^2 = c^2 \quad\text{with}\quad \begin{cases} a \text{ odd}, \\ b \text{ even}, \\ a, b, c \text{ having no common factors.} \end{cases}$$


The tools that we use are factorization and divisibility. 
Our first observation is that if (a, b, c) is a primitive Pythagorean triple, then 
we can factor 
$$a^2 = c^2 - b^2 = (c - b)(c + b).$$


Here are a few examples from the list given earlier, where, as noted, we always
take a to be odd and b to be even:


$$\begin{aligned}
3^2 &= 5^2 - 4^2 = (5 - 4)(5 + 4) = 1 \cdot 9, \\
15^2 &= 17^2 - 8^2 = (17 - 8)(17 + 8) = 9 \cdot 25, \\
35^2 &= 37^2 - 12^2 = (37 - 12)(37 + 12) = 25 \cdot 49, \\
33^2 &= 65^2 - 56^2 = (65 - 56)(65 + 56) = 9 \cdot 121.
\end{aligned}$$


It looks like c − b and c + b are themselves always squares. We check this observation
with a couple more examples:


$$\begin{aligned}
21^2 &= 29^2 - 20^2 = (29 - 20)(29 + 20) = 9 \cdot 49, \\
63^2 &= 65^2 - 16^2 = (65 - 16)(65 + 16) = 49 \cdot 81.
\end{aligned}$$
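
The same check can be run on every triple in the chapter's list rather than on a handful. The following Python sketch (the helper names `is_square` and `check` are illustrative assumptions) reorders each triple so that a is odd and b is even, then reports whether c − b and c + b are both squares and whether they share a common factor.

```python
from math import gcd, isqrt

def is_square(n):
    """True if n is a perfect square."""
    r = isqrt(n)
    return r * r == n

# Triples from the chapter's list, written with a odd and b even.
examples = [(3, 4, 5), (5, 12, 13), (15, 8, 17), (7, 24, 25), (21, 20, 29),
            (9, 40, 41), (35, 12, 37), (11, 60, 61), (45, 28, 53),
            (33, 56, 65), (63, 16, 65)]

def check(a, b, c):
    """Return (c - b, c + b, both_are_squares, no_common_factor)."""
    lo, hi = c - b, c + b
    return lo, hi, is_square(lo) and is_square(hi), gcd(lo, hi) == 1

for a, b, c in examples:
    lo, hi, squares, coprime = check(a, b, c)
    print(f"{a}^2 = {lo} * {hi}   both squares: {squares}   no common factor: {coprime}")
```

A numerical check of this kind only supports the conjecture; it does not replace the proof.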


How can we prove that c − b and c + b are squares? Another observation apparent
from our list of examples is that c − b and c + b seem to have no common
factors. We can prove this last assertion as follows. Suppose that d is a common
factor of c − b and c + b; that is, d divides both c − b and c + b. Then d also divides


$$(c + b) + (c - b) = 2c \quad\text{and}\quad (c + b) - (c - b) = 2b.$$


Thus, d divides 2b and 2c. But b and c have no common factor because we are
assuming that (a, b, c) is a primitive Pythagorean triple. So d must equal 1 or 2.
But d also divides $(c - b)(c + b) = a^2$, and a is odd, so d must be 1. In other
words, the only number dividing both c − b and c + b is 1, so c − b and c + b have
no common factor.

We now know that c − b and c + b are positive integers having no common
factor and that their product is a square, since $(c - b)(c + b) = a^2$. The only way that
this can happen is if c − b and c + b are themselves squares.² So we can write


$$c + b = s^2 \quad\text{and}\quad c - b = t^2,$$


where s > t ≥ 1 are odd integers with no common factors. Solving these two
equations for b and c yields

$$c = \frac{s^2 + t^2}{2} \quad\text{and}\quad b = \frac{s^2 - t^2}{2},$$

and then

$$a = \sqrt{(c - b)(c + b)} = st.$$


We have (almost) finished our first proof! The following theorem records our 
accomplishment. 


Theorem 2.1 (Pythagorean Triples Theorem). We will get every primitive Pythagorean
triple (a, b, c) with a odd and b even by using the formulas

$$a = st, \qquad b = \frac{s^2 - t^2}{2}, \qquad c = \frac{s^2 + t^2}{2},$$

where s > t ≥ 1 are chosen to be any odd integers with no common factors.


Why did we say that we have “almost” finished the proof? We have shown
that if (a, b, c) is a PPT with a odd, then there are odd integers s > t ≥ 1 with
no common factors so that a, b, and c are given by the stated formulas. But we
still need to check that these formulas always give a PPT. We first use a little bit of
algebra to show that the formulas give a Pythagorean triple. Thus


$$(st)^2 + \left(\frac{s^2 - t^2}{2}\right)^2 = s^2t^2 + \frac{s^4 - 2s^2t^2 + t^4}{4} = \frac{s^4 + 2s^2t^2 + t^4}{4} = \left(\frac{s^2 + t^2}{2}\right)^2.$$


We also need to check that $st$, $\frac{s^2 - t^2}{2}$, and $\frac{s^2 + t^2}{2}$ have no common factors. This
is most easily accomplished using an important property of prime numbers, so
we postpone the proof until Chapter 7, where you will finish the argument (Exercise 7.3).


²This is intuitively clear if you consider the factorization of c − b and c + b into primes, since
the primes in the factorization of c − b will be distinct from the primes in the factorization of c + b.
However, the existence and uniqueness of the factorization into primes is by no means as obvious as
it appears. We will discuss this further in Chapter 7.

For example, taking t = 1 in Theorem 2.1 gives a triple $\left(s, \frac{s^2 - 1}{2}, \frac{s^2 + 1}{2}\right)$
whose b and c entries differ by 1. This explains many of the examples that we
listed. The following table gives all possible triples with s ≤ 9.


  s   t   a = st   b = (s^2 - t^2)/2   c = (s^2 + t^2)/2
  3   1      3             4                   5
  5   1      5            12                  13
  7   1      7            24                  25
  9   1      9            40                  41
  5   3     15             8                  17
  7   3     21            20                  29
  7   5     35            12                  37
  9   5     45            28                  53
  9   7     63            16                  65
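
The formulas of Theorem 2.1 translate directly into a short program. The Python sketch below (function names `ppt_from_st` and `ppt_table` are illustrative assumptions) applies the formulas to every pair of odd numbers s > t ≥ 1 with no common factor and s at most a chosen bound, and confirms for each row that the result really satisfies $a^2 + b^2 = c^2$.

```python
from math import gcd

def ppt_from_st(s, t):
    """Apply the formulas of Theorem 2.1 to odd s > t >= 1 with gcd(s, t) = 1."""
    if not (s > t >= 1 and s % 2 == 1 and t % 2 == 1 and gcd(s, t) == 1):
        raise ValueError("need odd s > t >= 1 with no common factor")
    # s and t are odd, so s^2 - t^2 and s^2 + t^2 are even and the
    # integer divisions below are exact.
    return s * t, (s * s - t * t) // 2, (s * s + t * t) // 2

def ppt_table(max_s):
    """Rows (s, t, a, b, c) for all admissible pairs with s <= max_s."""
    rows = []
    for t in range(1, max_s, 2):
        for s in range(t + 2, max_s + 1, 2):
            if gcd(s, t) == 1:
                rows.append((s, t) + ppt_from_st(s, t))
    return rows

for s, t, a, b, c in ppt_table(9):
    assert a * a + b * b == c * c
    print(s, t, a, b, c)
```

Note that the pair s = 9, t = 3 is skipped because 9 and 3 have the common factor 3.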


A Notational Interlude 


Mathematicians have created certain standard notations as a shorthand for various 
quantities. We will keep our use of such notation to a minimum, but there are a 
few symbols that are so commonly used and are so useful that it is worthwhile to 
introduce them here. They are 


N = the set of natural numbers = 1, 2, 3, 4, ...,
Z = the set of integers = ..., −3, −2, −1, 0, 1, 2, 3, ...,
Q = the set of rational numbers (i.e., fractions).


In addition, mathematicians often use R to denote the real numbers and C for the 
complex numbers, but we will not need these. Why were these letters chosen? 
The choice of N, R, and C needs no explanation. The letter Z for the set of inte- 
gers comes from the German word “Zahlen,” which means numbers. Similarly, Q 
comes from the German “Quotient” (which is the same as the English word). We 
will also use the standard mathematical symbol ∈ to mean “is an element of the
set.” So, for example, a ∈ N means that a is a natural number, and x ∈ Q means
that x is a rational number.


Exercises 


2.1. (a) We showed that in any primitive Pythagorean triple (a, b, c), either a or b is even.
Use the same sort of argument to show that either a or b must be a multiple of 3.
(b) By examining the above list of primitive Pythagorean triples, make a guess about
when a, b, or c is a multiple of 5. Try to show that your guess is correct.


2.2. A nonzero integer d is said to divide an integer m if m = dk for some number k. 
Show that if d divides both m and n, then d also divides m — n and m+n. 


2.3. For each of the following questions, begin by compiling some data; next examine the 
data and formulate a conjecture; and finally try to prove that your conjecture is correct. (But 
don’t worry if you can’t solve every part of this problem; some parts are quite difficult.) 

(a) Which odd numbers a can appear in a primitive Pythagorean triple (a, b, c)? 

(b) Which even numbers b can appear in a primitive Pythagorean triple (a, b, c)?

(c) Which numbers c can appear in a primitive Pythagorean triple (a, b,c)? 


2.4. In our list of examples are the two primitive Pythagorean triples 
$$33^2 + 56^2 = 65^2 \quad\text{and}\quad 16^2 + 63^2 = 65^2.$$


Find at least one more example of two primitive Pythagorean triples with the same value 
of c. Can you find three primitive Pythagorean triples with the same c? Can you find more 
than three? 


2.5. In Chapter 1 we saw that the nth triangular number $T_n$ is given by the formula

$$T_n = 1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2}.$$

The first few triangular numbers are 1, 3, 6, and 10. In the list of the first few Pythagorean
triples (a, b, c), we find (3, 4, 5), (5, 12, 13), (7, 24, 25), and (9, 40, 41). Notice that in each
case, the value of b is four times a triangular number.
(a) Find a primitive Pythagorean triple (a, b, c) with $b = 4T_5$. Do the same for $b = 4T_6$
and for $b = 4T_7$.
(b) Do you think that for every triangular number $T_n$, there is a primitive Pythagorean
triple (a, b, c) with $b = 4T_n$? If you believe that this is true, then prove it. Otherwise,
find some triangular number for which it is not true.


2.6. If you look at the table of primitive Pythagorean triples in this chapter, you will see 
many triples in which c is 2 greater than a. For example, the triples (3, 4,5), (15,8, 17), 
(35, 12, 37), and (63, 16, 65) all have this property. 
(a) Find two more primitive Pythagorean triples (a, b, c) having c = a + 2.
(b) Find a primitive Pythagorean triple (a, b, c) having c = a + 2 and c > 1000.
(c) Try to find a formula that describes all primitive Pythagorean triples (a, b, c) having
c = a + 2.


2.7. For each primitive Pythagorean triple (a, b, c) in the table in this chapter, compute the 
quantity 2c — 2a. Do these values seem to have some special form? Try to prove that your 
observation is true for all primitive Pythagorean triples. 


2.8. Let m and n be numbers that differ by 2, and write the sum $\frac{1}{m} + \frac{1}{n}$ as a fraction in
lowest terms. For example, $\frac{1}{2} + \frac{1}{4} = \frac{3}{4}$ and $\frac{1}{3} + \frac{1}{5} = \frac{8}{15}$.
(a) Compute the next three examples.

(b) Examine the numerators and denominators of the fractions in (a) and compare them 
with the table of Pythagorean triples on page 18. Formulate a conjecture about such 
fractions. 

(c) Prove that your conjecture is correct. 


2.9. (a) Read about the Babylonian number system and write a short description, includ- 
ing the symbols for the numbers 1 to 10 and the multiples of 10 from 20 to 50. 

(b) Read about the Babylonian tablet called Plimpton 322 and write a brief report, in- 
cluding its approximate date of origin. 

(c) The second and third columns of Plimpton 322 give pairs of integers (a, c) having
the property that $c^2 - a^2$ is a perfect square. Convert some of these pairs from Babylonian
numbers to decimal numbers and compute the value of b so that (a, b, c) is a
Pythagorean triple.


Chapter 3 


Pythagorean Triples 
and the Unit Circle 


In the previous chapter we described all solutions to

$$a^2 + b^2 = c^2$$

in whole numbers a, b, c. If we divide this equation by $c^2$, we obtain

$$\left(\frac{a}{c}\right)^2 + \left(\frac{b}{c}\right)^2 = 1.$$

So the pair of rational numbers (a/c, b/c) is a solution to the equation

$$x^2 + y^2 = 1.$$


Everyone knows what the equation $x^2 + y^2 = 1$ looks like: It is a circle C of
radius 1 with center at (0, 0). We are going to use the geometry of the circle C to
find all the points on C whose xy-coordinates are rational numbers. Notice that
the circle has four obvious points with rational coordinates, (±1, 0) and (0, ±1).
Suppose that we take any (rational) number m and look at the line L going through
the point (−1, 0) and having slope m. (See Figure 3.1.) The line L is given by the
equation

$$y = m(x + 1) \qquad \text{(point-slope formula)}.$$

It is clear from the picture that the intersection C ∩ L consists of exactly two points,
and one of those points is (−1, 0). We want to find the other one.
[Figure 3.1: The Intersection of a Circle and a Line. The line L has slope m.]

To find the intersection of C and L, we need to solve the equations

$$x^2 + y^2 = 1 \quad\text{and}\quad y = m(x + 1)$$

for x and y. Substituting the second equation into the first and simplifying, we
need to solve

$$x^2 + (m(x + 1))^2 = 1,$$
$$x^2 + m^2(x^2 + 2x + 1) = 1,$$
$$(m^2 + 1)x^2 + 2m^2x + (m^2 - 1) = 0.$$
This is just a quadratic equation, so we could use the quadratic formula to solve
for x. But there is a much easier way to find the solution. We know that x = −1
must be a solution, since the point (−1, 0) is on both C and L. This means that we
can divide the quadratic polynomial by x + 1 to find the other root:

$$\frac{(m^2 + 1)x^2 + 2m^2x + (m^2 - 1)}{x + 1} = (m^2 + 1)x + (m^2 - 1).$$

So the other root is the solution of $(m^2 + 1)x + (m^2 - 1) = 0$, which means
that

$$x = \frac{1 - m^2}{1 + m^2}.$$

Then we substitute this value of x into the equation y = m(x + 1) of the line L to
find the y-coordinate,

$$y = m(x + 1) = m\left(\frac{1 - m^2}{1 + m^2} + 1\right) = \frac{2m}{1 + m^2}.$$
Thus, for every rational number m we get a solution in rational numbers

$$\left(\frac{1 - m^2}{1 + m^2}, \frac{2m}{1 + m^2}\right) \quad\text{to the equation}\quad x^2 + y^2 = 1.$$

On the other hand, if we have a solution $(x_1, y_1)$ in rational numbers, then the
slope of the line through $(x_1, y_1)$ and (−1, 0) will be a rational number. So by
taking all possible values for m, the process we have described will yield every solution
to $x^2 + y^2 = 1$ in rational numbers [except for (−1, 0), which corresponds
to a vertical line having slope “m = ∞”]. We summarize our results in the following
theorem.


Theorem 3.1. Every point on the circle

$$x^2 + y^2 = 1$$

whose coordinates are rational numbers can be obtained from the formula

$$(x, y) = \left(\frac{1 - m^2}{1 + m^2}, \frac{2m}{1 + m^2}\right)$$

by substituting in rational numbers for m [except for the point (−1, 0), which is the
limiting value as m → ∞].
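
As a small illustration of Theorem 3.1, the Python sketch below uses exact rational arithmetic from the standard `fractions` module to turn a slope m into a point on the circle, and to recover the slope from a point. The function names `circle_point` and `slope_from_point` are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

def circle_point(m):
    """Second intersection of x^2 + y^2 = 1 with the line of slope m through (-1, 0)."""
    m = Fraction(m)
    x = (1 - m * m) / (1 + m * m)
    y = 2 * m / (1 + m * m)
    return x, y

def slope_from_point(x, y):
    """Slope of the line through (-1, 0) and a rational point (x, y) with x != -1."""
    return Fraction(y) / (Fraction(x) + 1)

for m in [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4), Fraction(-5, 7)]:
    x, y = circle_point(m)
    assert x * x + y * y == 1          # the point really lies on the circle
    assert slope_from_point(x, y) == m  # and the slope is recovered exactly
    print(m, "->", (x, y))
```

Because `Fraction` arithmetic is exact, the two assertions test the identities themselves rather than floating-point approximations.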


How is this formula for rational points on a circle related to our formula for
Pythagorean triples? If we write the rational number m as a fraction v/u, then our
formula becomes

$$(x, y) = \left(\frac{u^2 - v^2}{u^2 + v^2}, \frac{2uv}{u^2 + v^2}\right),$$

and clearing denominators gives the Pythagorean triple

$$(a, b, c) = (u^2 - v^2, 2uv, u^2 + v^2).$$

This is another way of describing Pythagorean triples, although to describe only
the primitive ones would require some restrictions on u and v. You can relate this
description to the formula in Chapter 2 by setting

$$u = \frac{s + t}{2} \quad\text{and}\quad v = \frac{s - t}{2}.$$
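
The relation between the two descriptions can be checked numerically. The sketch below assumes the substitution u = (s + t)/2 and v = (s − t)/2 and compares the triple $(u^2 - v^2, 2uv, u^2 + v^2)$ with the triple given by the Chapter 2 formulas. All function names are illustrative assumptions.

```python
def triple_from_uv(u, v):
    """Triple (u^2 - v^2, 2uv, u^2 + v^2) from the unit-circle description."""
    return u * u - v * v, 2 * u * v, u * u + v * v

def triple_from_st(s, t):
    """Triple (st, (s^2 - t^2)/2, (s^2 + t^2)/2) from the Chapter 2 formulas."""
    return s * t, (s * s - t * t) // 2, (s * s + t * t) // 2

def uv_from_st(s, t):
    """Assumed substitution u = (s + t)/2, v = (s - t)/2 for odd s > t >= 1."""
    # s and t are both odd, so s + t and s - t are even.
    return (s + t) // 2, (s - t) // 2

for s, t in [(3, 1), (5, 1), (5, 3), (7, 3), (9, 7)]:
    u, v = uv_from_st(s, t)
    assert triple_from_uv(u, v) == triple_from_st(s, t)
    print((s, t), "->", (u, v), "->", triple_from_uv(u, v))
```

The agreement follows from the identities $u^2 - v^2 = st$, $2uv = (s^2 - t^2)/2$, and $u^2 + v^2 = (s^2 + t^2)/2$ under that substitution.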


Exercises 


3.1. As we have just seen, we get every Pythagorean triple (a, b, c) with b even from the
formula

$$(a, b, c) = (u^2 - v^2, 2uv, u^2 + v^2)$$

by substituting in different integers for u and v. For example, (u, v) = (2, 1) gives the
smallest triple (3, 4, 5).
(a) If u and v have a common factor, explain why (a, b, c) will not be a primitive Pythagorean
triple.

(b) Find an example of integers u > v > 0 that do not have a common factor, yet the
Pythagorean triple $(u^2 - v^2, 2uv, u^2 + v^2)$ is not primitive.

(c) Make a table of the Pythagorean triples that arise when you substitute in all values
of u and v with 1 ≤ v < u ≤ 10.

(d) Using your table from (c), find some simple conditions on u and v that ensure that
the Pythagorean triple $(u^2 - v^2, 2uv, u^2 + v^2)$ is primitive.

(e) Prove that your conditions in (d) really work. 


3.2. (a) Use the lines through the point (1, 1) to describe all the points on the circle

$$x^2 + y^2 = 2$$

whose coordinates are rational numbers.
(b) What goes wrong if you try to apply the same procedure to find all the points on the
circle $x^2 + y^2 = 3$ with rational coordinates?


3.3. Find a formula for all the points on the hyperbola

$$x^2 - y^2 = 1$$


whose coordinates are rational numbers. [Hint. Take the line through the point (—1, 0) 
having rational slope m and find a formula in terms of m for the second point where the 
line intersects the hyperbola. ] 


3.4. The curve

$$y^2 = x^3 + 8$$


contains the points (1, —3) and (—7/4, 13/8). The line through these two points intersects 
the curve in exactly one other point. Find this third point. Can you explain why the 
coordinates of this third point are rational numbers? 


3.5. Numbers that are both square and triangular numbers were introduced in Chapter 1, 
and you studied them in Exercise 1.1. 

(a) Show that every square-triangular number can be described using the solutions in
positive integers to the equation $x^2 - 2y^2 = 1$. [Hint. Rearrange the equation
$m^2 = \frac{1}{2}(n^2 + n)$.]

(b) The curve $x^2 - 2y^2 = 1$ includes the point (1, 0). Let L be the line through (1, 0)
having slope m. Find the other point where L intersects the curve.

(c) Suppose that you take m to equal m = v/u, where (u, v) is a solution to $u^2 - 2v^2 = 1$.
Show that the other point that you found in (b) has integer coordinates. Further,
changing the signs of the coordinates if necessary, show that you get a solution to
$x^2 - 2y^2 = 1$ in positive integers.

(d) Starting with the solution (3, 2) to $x^2 - 2y^2 = 1$, apply (b) and (c) repeatedly to find
several more solutions to $x^2 - 2y^2 = 1$. Then use those solutions to find additional
examples of square-triangular numbers.

(e) Prove that this procedure leads to infinitely many different square-triangular numbers.

(f) Prove that every square-triangular number can be constructed in this way. (This part
is very difficult. Don’t worry if you can’t solve it.)


Chapter 4 


Sums of Higher Powers 
and Fermat’s Last Theorem 


In the previous two chapters we discovered that the equation

$$a^2 + b^2 = c^2$$

has lots of solutions in whole numbers a, b, c. It is natural to ask whether there are
solutions when the exponent 2 is replaced by a higher power. For example, do the
equations

$$a^3 + b^3 = c^3 \quad\text{and}\quad a^4 + b^4 = c^4 \quad\text{and}\quad a^5 + b^5 = c^5$$


have solutions in nonzero integers a, b,c? The answer is “NO.” Sometime around 
1637, Pierre de Fermat showed that there is no solution for exponent 4. During 
the eighteenth and nineteenth centuries, Carl Friedrich Gauss and Leonhard Euler 
showed that there is no solution for exponent 3 and Lejeune Dirichlet and Adrien 
Legendre dealt with the exponent 5. The general problem of showing that the
equation

$$a^n + b^n = c^n$$

has no solutions in positive integers if n ≥ 3 is known as “Fermat’s Last Theorem.”
It has attained almost cult status in the 350 years since Fermat scribbled the
following assertion in the margin of one of his books: 


It is impossible to separate a cube into two cubes, or a fourth power into two
fourth powers, or in general any power higher than the second into powers of
like degree. I have discovered a truly remarkable proof which this margin is
too small to contain.¹


Few mathematicians today believe that Fermat had a valid proof of his “Theorem,”
which is called his Last Theorem because it was the last of his assertions
that remained unproved. The history of Fermat’s Last Theorem is fascinating, with 
literally hundreds of mathematicians making important contributions. Even a brief 
summary could easily fill a book. This is not our intent in this volume, so we will 
be content with a few brief remarks. 

One of the first general results on Fermat’s Last Theorem, as opposed to verifi- 
cation for specific exponents n, was given by Sophie Germain in 1823. She proved 
that if both p and 2p + 1 are primes then the equation $a^p + b^p = c^p$ has no solutions
in integers a, b, c with p not dividing the product abc. A later result of a
similar nature, due to A. Wieferich in 1909, is that the same conclusion is true if
the quantity $2^p - 2$ is not divisible by $p^2$. Meanwhile, during the latter part of
the nineteenth century a number of mathematicians, including Richard Dedekind, 
Leopold Kronecker, and especially Ernst Kummer, developed a new field of math- 
ematics called algebraic number theory and used their theory to prove Fermat’s 
Last Theorem for many exponents, although still only a finite list. Then, in 1985, 
L.M. Adleman, D.R. Heath-Brown, and E. Fouvry used a refinement of Germain’s 
criterion together with difficult analytic estimates to prove that there are infinitely 
many primes p such that $a^p + b^p = c^p$ has no solutions with p not dividing abc.


Sophie Germain (1776-1831)

Sophie Germain was a French mathematician
who did important work in number theory and differential equations.
She is best known for her work on Fermat’s Last Theorem, where she gave
a simple criterion that suffices to show that the equation $a^p + b^p = c^p$ has
no solutions with abc not divisible by p. She also did work on acoustics and
elasticity, especially the theory of vibrating plates. As a mathematics student,
she was forced to take correspondence courses from the École Polytechnique
in Paris, since they did not accept women as students. For a similar reason,
she began her extensive correspondence with Gauss using the pseudonym
Monsieur Le Blanc; but when she eventually revealed her identity, Gauss
was delighted and sufficiently impressed with her work to recommend her
for an honorary degree at the University of Göttingen.


In 1986 Gerhard Frey suggested a new line of attack on Fermat’s problem using 
a notion called modularity. Frey’s idea was refined by Jean-Pierre Serre, and Ken 


Ribet subsequently proved that if the Modularity Conjecture is true, then Fermat’s
Last Theorem is true. More precisely, Ribet proved that if every semistable elliptic curve²
is modular³ then Fermat’s Last Theorem is true. The Modularity Conjecture, which
asserts that every rational elliptic curve is modular, was at that time a conjecture
originally formulated by Goro Shimura and Yutaka Taniyama. Finally, in 1994,
Andrew Wiles announced a proof that every semistable rational elliptic curve is
modular, thereby completing the proof of Fermat’s 350-year-old claim. Wiles’s
proof, which is a tour de force using the vast machinery of modern number theory
and algebraic geometry, is far too complicated for us to describe in detail, but we
will try to convey the flavor of his proof in Chapter 46.

¹Translated from the Latin: “Cubum autem in duos cubos, aut quadrato quadratum in duos
quadrato quadratos, & generaliter nullam in infinitum ultra quadratum potestatem in duos ejusdem
nominis fas est dividere; cujus rei demonstrationem mirabilem sane detexi. Hanc marginis exiguitas
non caperet.”

Few mathematical or scientific discoveries arise in a vacuum. Even Sir Isaac 
Newton, the transcendent genius not noted for his modesty, wrote that “If I have 
seen further, it is by standing on the shoulders of giants.” Here is a list of some 
of the giants, all contemporary mathematicians, whose work either directly or in- 
directly contributed to Wiles’s brilliant proof. The diversified nationalities high- 
light the international character of modern mathematics. In alphabetical order: 
Spencer Bloch (USA), Henri Carayol (France), John Coates (Australia), Pierre 
Deligne (Belgium), Ehud de Shalit (Israel), Fred Diamond (USA), Gerd Falt- 
ings (Germany), Matthias Flach (Germany), Gerhard Frey (Germany), Alexander 
Grothendieck (France), Yves Hellegouarch (France), Haruzo Hida (Japan), Ken- 
kichi Iwasawa (Japan), Kazuya Kato (Japan), Nick Katz (USA), V.A. Kolyvagin 
(Russia), Ernst Kunz (Germany), Robert Langlands (Canada), Hendrik Lenstra 
(The Netherlands), Wen-Ch’ing Winnie Li (USA), Barry Mazur (USA), André 
Néron (France), Ravi Ramakrishna (USA), Michel Raynaud (France), Ken Ri- 
bet (USA), Karl Rubin (USA), Jean-Pierre Serre (France), Goro Shimura (Japan), 
Yutaka Taniyama (Japan), John Tate (USA), Richard Taylor (England), Jacques 
Tilouine (France), Jerry Tunnell (USA), André Weil (France), Andrew Wiles (Eng- 
land). 


Exercises 
4.1. Write a one- to two-page biography on one (or more) of the following mathematicians. 


Be sure to describe their mathematical achievements, especially in number theory, and 
some details of their lives. Also include a paragraph putting them into an historical context 


by describing the times (scientifically, politically, socially, etc.) during which they lived
and worked: (a) Niels Abel, (b) Claude Gaspar Bachet de Meziriac, (c) Richard Dedekind,
(d) Diophantus of Alexandria, (e) Lejeune Dirichlet, (f) Eratosthenes, (g) Euclid of Alexandria,
(h) Leonhard Euler, (i) Pierre de Fermat, (j) Leonardo Fibonacci, (k) Carl Friedrich
Gauss, (l) Sophie Germain, (m) David Hilbert, (n) Carl Jacobi, (o) Leopold Kronecker,
(p) Ernst Kummer, (q) Joseph-Louis Lagrange, (r) Adrien-Marie Legendre, (s) Joseph Liouville,
(t) Marin Mersenne, (u) Hermann Minkowski, (v) Sir Isaac Newton, (w) Pythagoras,
(x) Srinivasa Ramanujan, (y) Bernhard Riemann, (z) P.L. Tchebychef (also spelled
Chebychev).

²An elliptic curve is a certain sort of curve, not an ellipse, given by an equation of the form
$y^2 = x^3 + ax^2 + bx + c$, where a, b, c are integers. The elliptic curve is semistable if the quantities
$3b - a^2$ and $27c - 9ab + 2a^3$ have no common factors other than 2 and satisfy a few other technical
conditions. We study elliptic curves in Chapters 41-46.

³An elliptic curve is called modular if there is a map to it from another special sort of curve called
a modular curve.


4.2. The equation $a^2 + b^2 = c^2$ has lots of solutions in positive integers, while the equation
$a^3 + b^3 = c^3$ has no solutions in positive integers. This exercise asks you to look for
solutions to the equation

$$a^3 + b^3 = c^2 \qquad (*)$$

in integers c ≥ b ≥ a ≥ 1.

(a) The equation (*) has the solution (a, b, c) = (2, 2, 4). Find three more solutions in
positive integers. [Hint. Look for solutions of the form $(a, b, c) = (xz, yz, z^2)$. Not
every choice of x, y, z will work, of course, so you’ll need to figure out which ones
do work.]

(b) If (A, B, C) is a solution to (*) and n is any integer, show that $(n^2A, n^2B, n^3C)$ is
also a solution to (*). We will say that a solution (a, b, c) to (*) is primitive if it does
not look like $(n^2A, n^2B, n^3C)$ for any n ≥ 2.

(c) Write down four different primitive solutions to (*). [That is, redo (a) using only 
primitive solutions. ] 

(d) The solution (2, 2, 4) has a = b. Find all primitive solutions that have a = b. 

(e) Find a primitive solution to (*) that has a > 10000. 


Chapter 5 


Divisibility and the Greatest 
Common Divisor 


As we have already seen in our study of Pythagorean triples, the notions of divis- 
ibility and factorizations are important tools in number theory. In this chapter we 
will look at these ideas more closely. 

Suppose that m and n are integers with m ≠ 0. We say that m divides n if n is
a multiple of m, that is, if there is an integer k such that n = mk. If m divides n,
we write m | n. Similarly, if m does not divide n, then we write m ∤ n. For example,


3 | 6 and 12 | 132, since 6 = 3 · 2 and 132 = 12 · 11.


The divisors of 6 are 1, 2, 3, and 6. On the other hand, 5 ∤ 7, since no integer
multiple of 5 is equal to 7. A number that divides n is called a divisor of n.

If we are given two numbers, we can look for common divisors, that is, num- 
bers that divide both of them. For example, 4 is a common divisor of 12 and 20, 
since 4|12 and 4|20. Notice that 4 is the largest common divisor of 12 and 20. 
Similarly, 3 is a common divisor of 18 and 30, but it is not the largest, since 6 
is also a common divisor. The largest common divisor of two numbers is an ex- 
tremely important quantity that will frequently appear during our number theoretic 
excursions. 


The greatest common divisor of two numbers a and b (not both zero) 
is the largest number that divides both of them. It is denoted by 
gcd(a, b). If gcd(a, b) = 1, we say that a and b are relatively prime. 


Two examples that we mentioned above are


gcd(12, 20) = 4 and gcd(18, 30) = 6.

Another example is

gcd(225, 120) = 15.


We can check that this answer is correct by factoring $225 = 3^2 \cdot 5^2$ and
$120 = 2^3 \cdot 3 \cdot 5$, but, in general, factoring a and b is not an efficient way to compute their
greatest common divisor.¹

The most efficient method known for finding the greatest common divisor of
two numbers is called the Euclidean algorithm. It consists of doing a sequence of
divisions with remainder until the remainder is zero. We will illustrate with two 
examples before describing the general method. 

As our first example, we will compute gcd(36, 132). The first step is to di- 
vide 132 by 36, which gives a quotient of 3 and a remainder of 24. We write this 
as 

$$132 = 3 \times 36 + 24.$$


The next step is to take 36 and divide it by the remainder 24 from the previous step.
This gives

$$36 = 1 \times 24 + 12.$$


Next we divide 24 by 12, and we find a remainder of 0,

$$24 = 2 \times 12 + 0.$$


The Euclidean algorithm says that as soon as you get a remainder of 0, the re- 
mainder from the previous step is the greatest common divisor of the original two 
numbers. So in this case we find that gcd(132, 36) = 12. 

Let’s do a larger example. We will compute 


gcd(1160718174, 316258250). 


Our reason for doing a large example like this is to help convince you that the 
Euclidean algorithm gives a far more efficient way to compute gcd’s than factor- 
ization. We begin by dividing 1160718174 by 316258250, which gives 3 with a 
remainder of 211943424. Next we take 316258250 and divide it by 211943424. 
This process continues until we get a remainder of 0. The calculations are given in 


the following table:


1160718174 =  3 × 316258250 + 211943424
 316258250 =  1 × 211943424 + 104314826
 211943424 =  2 × 104314826 +   3313772
 104314826 = 31 ×   3313772 +   1587894
   3313772 =  2 ×   1587894 +    137984
   1587894 = 11 ×    137984 +     70070
    137984 =  1 ×     70070 +     67914
     70070 =  1 ×     67914 +      2156
     67914 = 31 ×      2156 +      1078   ← gcd
      2156 =  2 ×      1078 +         0


¹An even less efficient way to compute the greatest common divisor of a and b is the method
taught to my daughter by her fourth grade teacher, who recommended that the students make complete
lists of all the divisors of a and b and then pick out the largest number that appears on both
lists!


Notice how at each step we divide a number A by a number B to get a quotient Q 
and a remainder R. In other words, 


$$A = Q \times B + R.$$


Then at the next step we replace our old A and B with the numbers B and R and 
continue the process until we get a remainder of 0. At that point, the remainder R 
from the previous step is the greatest common divisor of our original two numbers. 
So the above calculation shows that 


gcd(1160718174, 316258250) = 1078. 


We can partly check our calculation (always a good idea) by verifying that 1078 is 
indeed a common divisor. Thus 


1160718174 = 1078 × 1076733 and 316258250 = 1078 × 293375.
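
The repeated replacement of (A, B) by (B, R) is easy to express as a loop. The following Python sketch records each division step and returns the last nonzero remainder. The function names `euclid_steps` and `gcd_euclid` are illustrative assumptions, and the code assumes that both inputs are positive integers.

```python
def euclid_steps(a, b):
    """Return the division steps (A, Q, B, R), each satisfying A = Q*B + R."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)      # quotient and remainder of a divided by b
        steps.append((a, q, b, r))
        a, b = b, r              # replace the old (A, B) with (B, R)
    return steps

def gcd_euclid(a, b):
    """Greatest common divisor as the last nonzero remainder."""
    steps = euclid_steps(a, b)
    # In the final step the remainder is 0, and the divisor B of that step
    # is the remainder from the step before it.
    return steps[-1][2] if steps else a

for A, Q, B, R in euclid_steps(1160718174, 316258250):
    print(f"{A} = {Q} x {B} + {R}")
print("gcd =", gcd_euclid(1160718174, 316258250))
```

Running the loop on the two examples in the text reproduces the division steps shown there, line by line.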


There is one more practical matter to be mentioned before we undertake a 
theoretical analysis of the Euclidean algorithm. If we are given A and B, how can 
we find the quotient Q and the remainder R? Of course, you can always use long 
division, but that can be time consuming and subject to arithmetic errors if A and B 
are large. A pleasant alternative is to find a calculator or computer program that will 
automatically compute Q and R for you. However, even if you are only equipped 
with an inexpensive calculator, there is an easy three-step method to find Q and R. 


Method to Compute Q and R on a Calculator So That A = B x Q + R


1. Use the calculator to divide A by B. You get a number with decimals.
2. Discard all the digits to the right of the decimal point. This gives Q.
3. To find R, use the formula R = A - B x Q.

For example, suppose that A = 12345 and B = 417. Then A/B = 29.6043...,
so Q = 29 and R = 12345 - 417 · 29 = 252.
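
The three-step calculator method can be mirrored in a few lines of Python. This is only a sketch: the function name quotient_and_remainder is our own, and we assume A ≥ 0 and B > 0. It uses integer division instead of a decimal quotient, since a calculator-style decimal can lose digits when A is very large.

```python
def quotient_and_remainder(A, B):
    """Return (Q, R) with A = B*Q + R and 0 <= R < B, assuming A >= 0 and B > 0."""
    Q = A // B         # steps 1 and 2: divide, then discard the fractional part
    R = A - B * Q      # step 3: R = A - B x Q
    return Q, R

print(quotient_and_remainder(12345, 417))   # the example above: (29, 252)
```

Python's built-in divmod(A, B) returns the same pair for positive inputs.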

We’re now ready to analyze the Euclidean algorithm. The general method 
looks like 


a = q_1 x b + r_1
b = q_2 x r_1 + r_2
r_1 = q_3 x r_2 + r_3
r_2 = q_4 x r_3 + r_4
...
r_{n-3} = q_{n-1} x r_{n-2} + r_{n-1}
r_{n-2} = q_n x r_{n-1} + r_n   <- gcd
r_{n-1} = q_{n+1} x r_n + 0


If we let r_0 = b and r_{-1} = a, then every line looks like

r_{i-1} = q_{i+1} x r_i + r_{i+1}.


Why is the last nonzero remainder r_n a common divisor of a and b? We start
from the bottom and work our way up. The last line r_{n-1} = q_{n+1} x r_n shows that r_n
divides r_{n-1}. Then the previous line


r_{n-2} = q_n x r_{n-1} + r_n


shows that r_n divides r_{n-2}, since it divides both r_{n-1} and r_n. Now looking at the
line above that, we already know that r_n divides both r_{n-1} and r_{n-2}, so we find
that r_n also divides r_{n-3}. Moving up line by line, when we reach the second line we
will already know that r_n divides r_2 and r_1. Then the second line b = q_2 x r_1 + r_2
tells us that r_n divides b. Finally, we move up to the top line and use the fact
that r_n divides both r_1 and b to conclude that r_n also divides a. This completes our
verification that the last nonzero remainder r_n is a common divisor of a and b.

But why is r_n the greatest common divisor of a and b? Suppose that d is any
common divisor of a and b. We will work our way back down the list of equations.
So from the first equation a = q_1 x b + r_1 and the fact that d divides both a and b,
we see that d also divides r_1. Then the second equation b = q_2 x r_1 + r_2 shows us
that d must divide r_2. Continuing down line by line, at each stage we will know
that d divides the previous two remainders r_{i-1} and r_i, and then the current line
r_{i-1} = q_{i+1} x r_i + r_{i+1} will tell us that d also divides the next remainder r_{i+1}.
Eventually, we reach the penultimate line r_{n-2} = q_n x r_{n-1} + r_n, at which point
we conclude that d divides r_n. So we have shown that if d is any common divisor
of a and b then d will divide r_n, and in particular d is no larger than r_n.
Therefore, r_n must be the greatest common divisor of a and b.

This completes our verification that the Euclidean algorithm actually
computes the greatest common divisor, a fact of sufficient importance to be officially
recorded.


Theorem 5.1 (Euclidean Algorithm). To compute the greatest common divisor of
two numbers a and b, let r_{-1} = a, let r_0 = b, and compute successive quotients
and remainders

r_{i-1} = q_{i+1} x r_i + r_{i+1}

for i = 0, 1, 2, ... until some remainder r_{n+1} is 0. The last nonzero remainder r_n
is then the greatest common divisor of a and b.
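
The theorem translates almost word for word into a short loop. The following is a minimal sketch, not a definitive implementation: the name euclid_gcd is our own, and we assume a and b are nonnegative integers that are not both zero.

```python
def euclid_gcd(a, b):
    """Greatest common divisor of nonnegative integers a and b, not both zero."""
    if a == 0 and b == 0:
        raise ValueError("gcd(0, 0) is not defined")
    # Replace the pair (A, B) by (B, R) until the remainder R is 0.
    while b != 0:
        a, b = b, a % b
    return a   # the last nonzero remainder

print(euclid_gcd(1160718174, 316258250))   # 1078, as in the worked example
```

Each pass through the loop corresponds to one line r_{i-1} = q_{i+1} x r_i + r_{i+1} of the algorithm; the quotient itself is never needed, only the remainder.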


There remains the question of why the Euclidean algorithm always finishes. In 
other words, we know that the last nonzero remainder will be the desired gcd, but 
how do we know that we ever get a remainder that does equal 0? This is not a 
silly question, since it is easy to give algorithms that do not terminate; and there 
are even very simple algorithms for which it is not known whether or not they 
always terminate. Fortunately, it is easy to see that the Euclidean algorithm always 
terminates. The reason is simple. Each time we compute a quotient with remainder, 


A = Q x B + R,


the remainder will be between 0 and B - 1. This is clear, since if R ≥ B, then we
can add one more onto the quotient Q and subtract B from R. So the successive
remainders in the Euclidean algorithm continually decrease:


b = r_0 > r_1 > r_2 > r_3 > ...


But all the remainders are greater than or equal to 0, so we have a strictly decreasing 
sequence of nonnegative integers. Eventually, we must reach a remainder that 
equals 0; in fact, it is clear that we will reach a remainder of 0 in at most b steps. 
Fortunately, the Euclidean algorithm is far more efficient than this. You will show 
in the exercises that the number of steps in the Euclidean algorithm is at most seven 
times the number of digits in b. So, on a computer, it is quite feasible to compute 
gcd(a, b) when a and b have hundreds or even thousands of digits! 
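
To get a feeling for how few steps are needed, one can simply count them. The sketch below is illustrative only; the function name euclid_steps is our own, and we assume b > 0. It compares the number of division steps with the bound of seven times the number of digits in b.

```python
def euclid_steps(a, b):
    """Number of divisions with remainder the Euclidean algorithm performs on (a, b)."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

a, b = 1160718174, 316258250
print(euclid_steps(a, b), "steps; the bound is", 7 * len(str(b)))
```

For the ten-line table computed earlier in this chapter, the count is 10, far below the bound of 63 for a nine-digit b.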


Exercises 


5.1. Use the Euclidean algorithm to compute each of the following gcd’s. 
(a) gcd(12345, 67890) (b) gcd(54321, 9876)


5.2. Write a program to compute the greatest common divisor gcd(a, b) of two
integers a and b. Your program should work even if one of a or b is zero. Make sure that you
don’t go into an infinite loop if a and b are both zero!


5.3. Let b = r_0, r_1, r_2, ... be the successive remainders in the Euclidean algorithm applied
to a and b. Show that after every two steps, the remainder is reduced by at least one half.
In other words, verify that


r_{i+2} < (1/2) r_i for every i = 0, 1, 2, ....


Conclude that the Euclidean algorithm terminates in at most 2 log_2(b) steps, where log_2 is
the logarithm to the base 2. In particular, show that the number of steps is at most seven
times the number of digits in b. [Hint. What is the value of log_2(10)?]


5.4. A number L is called a common multiple of m and n if both m and n divide L. 
The smallest such L is called the least common multiple of m and n and is denoted by 
LCM(m, n). For example, LCM(3, 7) = 21 and LCM(12, 66) = 132. 
(a) Find the following least common multiples. 
(i) LCM(8, 12) (ii) LCM(20, 30) (iii) LCM(51, 68) (iv) LCM(23, 18).
(b) For each of the LCMs that you computed in (a), compare the value of LCM(m, n)
to the values of m, n, and gcd(m, n). Try to find a relationship.
(c) Give an argument proving that the relationship you found is correct for all m and n.
(d) Use your result in (b) to compute LCM(301337, 307829).
(e) Suppose that gcd(m,n) = 18 and LCM(m,n) = 720. Find m and n. Is there more 
than one possibility? If so, find all of them. 


5.5. The “3n + 1 algorithm” works as follows. Start with any number n. If n is even, 
divide it by 2. If n is odd, replace it with 3n + 1. Repeat. So, for example, if we start 
with 5, we get the list of numbers 


5, 16, 8, 4, 2, 1, 4, 2, 1, ...,

and if we start with 7, we get

7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, ....


Notice that if we ever get to 1 the list just continues to repeat with 4, 2, 1’s. In general, one 
of the following two possibilities will occur: 


(i) We may end up repeating some number a that appeared earlier in our list, in which 
case the block of numbers between the two a’s will repeat indefinitely. In this case 
we say that the algorithm terminates at the last nonrepeated value, and the number
of distinct entries in the list is called the length of the algorithm. For example, the 
algorithm terminates at 1 for both 5 and 7. The length of the algorithm for 5 is 6, 
and the length of the algorithm for 7 is 17. 


(ii) We may never repeat the same number, in which case we say that the algorithm does 
not terminate. 


*There is, of course, a third possibility. We may get tired of computing and just stop working, in
which case one might say that the algorithm terminates due to exhaustion of the computer!

(a) Find the length and terminating value of the 3n + 1 algorithm for each of the following
starting values of n:


(i) n = 21    (ii) n = 13    (iii) n = 31


(b) Do some further experimentation and try to decide whether the 3n + 1 algorithm 
always terminates and, if so, at what value(s) it terminates. 

(c) Assuming that the algorithm terminates at 1, let L(n) be the length of the algorithm 
for starting value n. For example, L(5) = 6 and L(7) = 17. Show that if n = 8k + 4
with k ≥ 1, then L(n) = L(n + 1). [Hint. What does the algorithm do to the starting
values 8k + 4 and 8k + 5?]

(d) Show that if n = 128k + 28 then L(n) = L(n + 1) = L(n + 2).

(e) Find some other conditions, similar to those in (c) and (d), for which consecutive 
values of n have the same length. (It might be helpful to begin by using the next 
exercise to accumulate some data.) 


5.6. Write a program to implement the 3n + 1 algorithm described in the previous
exercise. The user will input n and your program should return the length L(n) and the
terminating value T(n) of the 3n + 1 algorithm. Use your program to create a table giving
the length and terminating value for all starting values 1 ≤ n ≤ 100.
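
As a starting point for experiments, here is one possible sketch of the 3n + 1 algorithm following the definitions in Exercise 5.5. The function name three_n_plus_one is our own, and we assume n ≥ 1. Note that the loop would run forever for a starting value at which the algorithm does not terminate, if such a value exists.

```python
def three_n_plus_one(n):
    """Return (length, terminating value) of the 3n + 1 algorithm for n >= 1.

    The list stops as soon as the next value has already appeared; the length
    counts the distinct values, and the terminating value is the last new one.
    """
    seen = set()
    last = None
    while n not in seen:
        seen.add(n)
        last = n
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return len(seen), last

for start in (5, 7):
    print(start, three_n_plus_one(start))
```

The two starting values printed here reproduce the lengths 6 and 17 quoted in Exercise 5.5.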


Chapter 6 


Linear Equations and the 
Greatest Common Divisor 


Given two whole numbers a and b, we are going to look at all the possible numbers 
we can get by adding a multiple of a to a multiple of b. In other words, we will 
consider all numbers obtained from the formula 


ax + by 


when we substitute all possible integers for x and y. Note that we are going to 
allow both positive and negative values for x and y. For example, we could take 
a = 42 and b = 30. Some of the values of ax + by for this a and b are given in the
following table: 


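         x = -3   x = -2   x = -1   x = 0   x = 1   x = 2   x = 3
y = -3    -216     -174     -132     -90     -48      -6      36
y = -2    -186     -144     -102     -60     -18      24      66
y = -1    -156     -114      -72     -30      12      54      96
y =  0    -126      -84      -42       0      42      84     126
y =  1     -96      -54      -12      30      72     114     156
y =  2     -66      -24       18      60     102     144     186
y =  3     -36        6       48      90     132     174     216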

Table of Values of 42x + 30y 


Our first observation is that every entry in the table is divisible by 6. This is not 
surprising, since both 42 and 30 are divisible by 6, so every number of the form 
42x + 30y = 6(7x + 5y) is a multiple of 6. More generally, it is clear that every
number of the form ax + by is divisible by gcd(a, b), since both a and b are
divisible by gcd(a, b).

A second observation, which is somewhat more surprising, is that the greatest
common divisor of 42 and 30, which is 6, actually appears in our table. Thus from
the table we see that


42 · (-2) + 30 · 3 = 6 = gcd(42, 30).


Further examples suggest the following conclusion: 


The smallest positive value of 
ax + by 
is equal to gcd(a, b). 


There are many ways to prove that this is true. We will take a constructive ap- 
proach, via the Euclidean algorithm, which has the advantage of giving a proce- 
dure for finding the appropriate values of x and y. In other words, we are going to 
describe a method of finding integers x and y that are solutions to the equation 


ax + by = gcd(a, b). 


Since, as we have already observed, every number ax + by is divisible by gcd(a, b), 
it will follow that the smallest positive value of ax + by is precisely gcd(a, b). 
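
Before proving the claim, it can be reassuring to test it numerically. The sketch below is only an experiment, not a proof: the function name smallest_positive_value and the search window (the parameter bound) are our own assumptions, and a brute-force search of this kind only sees the values of x and y inside that window.

```python
from math import gcd

def smallest_positive_value(a, b, bound=20):
    """Smallest positive value of a*x + b*y over -bound <= x, y <= bound."""
    values = [a * x + b * y
              for x in range(-bound, bound + 1)
              for y in range(-bound, bound + 1)]
    return min(v for v in values if v > 0)

print(smallest_positive_value(42, 30), gcd(42, 30))
```

For a = 42 and b = 30 both numbers printed are 6, in agreement with the table.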

How might we solve the equation ax + by = gcd(a, b)? If a and b are small, 
we might be able to guess a solution. For example, the equation 


10x + 35y = 5

has the solution x = -3 and y = 1, and the equation

7x + 11y = 1

has the solution x = -3 and y = 2. We also notice that there can be more than
one solution, since x = 8 and y = -5 is also a solution to 7x + 11y = 1.


However, if a and b are large, neither guesswork nor trial and error is going to 
be helpful. We are going to start by illustrating the Euclidean algorithm method 
for solving ax + by = gcd(a, b) with a particular example. So we are going to try 
to solve 

22x + 60y = gcd(22, 60). 


The first step is to perform the Euclidean algorithm to compute the gcd. We find 


60 = 2 x 22 + 16
22 = 1 x 16 +  6
16 = 2 x  6 +  4
 6 = 1 x  4 +  2
 4 = 2 x  2 +  0

This shows that gcd(22, 60) = 2, a fact that is clear without recourse to the
Euclidean algorithm. However, the Euclidean algorithm computation is important
because we’re going to use the intermediate quotients and remainders to solve the
equation 22x + 60y = 2. The first step is to rewrite the first equation as


16 = a - 2b, where we let a = 60 and b = 22.


We next substitute this value into the 16 appearing in the second equation. This 
gives (remember that b = 22) 


b = 1 x 16 + 6 = 1 x (a - 2b) + 6.

Rearranging this equation to isolate the remainder 6 yields

6 = b - (a - 2b) = -a + 3b.

Now substitute the values 16 and 6 into the next equation, 16 = 2 x 6 + 4:

a - 2b = 16 = 2 x 6 + 4 = 2(-a + 3b) + 4.

Again we isolate the remainder 4, yielding

4 = (a - 2b) - 2(-a + 3b) = 3a - 8b.

Finally, we use the equation 6 = 1 x 4 + 2 to get

-a + 3b = 6 = 1 x 4 + 2 = 1 x (3a - 8b) + 2.

Rearranging this equation gives the desired solution

-4a + 11b = 2.


(We should check our solution: -4 x 60 + 11 x 22 = -240 + 242 = 2.)

We can summarize the above computation in the following efficient tabular
form. Note that the left-hand equations are the Euclidean algorithm, and the
right-hand equations compute the solution to ax + by = gcd(a, b).


a = 2 x b + 16        16 = a - 2b

b = 1 x 16 + 6         6 = b - 1 x 16
                         = b - 1 x (a - 2b)
                         = -a + 3b

16 = 2 x 6 + 4         4 = 16 - 2 x 6
                         = (a - 2b) - 2 x (-a + 3b)
                         = 3a - 8b

6 = 1 x 4 + 2          2 = 6 - 1 x 4
                         = (-a + 3b) - 1 x (3a - 8b)
                         = -4a + 11b

4 = 2 x 2 + 0
Why does this method work? As the following table makes clear, we start with
the first two lines of the Euclidean algorithm, which involve the quantities a and b,
and work our way down.


a = q_1 b + r_1          r_1 = a - q_1 b

b = q_2 r_1 + r_2        r_2 = b - q_2 r_1
                             = b - q_2 (a - q_1 b)
                             = -q_2 a + (1 + q_1 q_2) b

r_1 = q_3 r_2 + r_3      r_3 = r_1 - q_3 r_2
                             = (a - q_1 b) - q_3 (-q_2 a + (1 + q_1 q_2) b)
                             = (1 + q_2 q_3) a - (q_1 + q_3 + q_1 q_2 q_3) b

As we move from line to line, we will continually be forming equations that look 
like 
latest remainder = some multiple of a plus some multiple of b.


Eventually, we get down to the last nonzero remainder, which we know is equal to
gcd(a, b), and this gives the desired solution to the equation gcd(a, b) = ax + by.

A larger example with a = 12453 and b = 2347 is given in tabular form below.
As before, the left-hand side is the Euclidean algorithm and the
right-hand side solves ax + by = gcd(a, b). We see that gcd(12453, 2347) = 1
and that the equation 12453x + 2347y = 1 has the solution (x, y) = (304, -1613).
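
The bookkeeping in these tables is mechanical enough to hand to a computer. The following sketch keeps, for every remainder, the two coefficients that express it as a multiple of a plus a multiple of b, exactly as in the right-hand column of the tables. The function name extended_euclid and its variable names are our own, and we assume a and b are positive integers.

```python
def extended_euclid(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), for positive integers a, b."""
    # Each remainder is stored together with coefficients: remainder = x*a + y*b.
    prev_r, prev_x, prev_y = a, 1, 0     # a = 1*a + 0*b
    r, x, y = b, 0, 1                    # b = 0*a + 1*b
    while r != 0:
        q = prev_r // r
        # The new remainder prev_r - q*r inherits the same combination of coefficients.
        prev_r, prev_x, prev_y, r, x, y = (
            r, x, y, prev_r - q * r, prev_x - q * x, prev_y - q * y)
    return prev_r, prev_x, prev_y

print(extended_euclid(60, 22))        # (2, -4, 11), i.e. -4a + 11b = 2
print(extended_euclid(12453, 2347))   # (1, 304, -1613)
```

Because the loop performs the same substitutions as the hand computation, it returns the same solutions that the tables produce.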
We now know that the equation 


ax + by = gcd(a, b) 


always has a solution in integers x and y. The final topic we discuss in this section 
is the question of how many solutions it has, and how to describe all the solutions. 
Let’s start with the case that a and b are relatively prime, that is, gcd(a, b) = 1, and 
suppose that (x_1, y_1) is a solution to the equation


ax + by = 1.


We can create additional solutions by adding a multiple of b to x_1 and
subtracting the same multiple of a from y_1. In other words, for any integer k we obtain
a new solution (x_1 + kb, y_1 - ka).¹ We can check that this is indeed a solution by
computing


a(x_1 + kb) + b(y_1 - ka) = a x_1 + akb + b y_1 - bka = a x_1 + b y_1 = 1.


¹Geometrically, we are starting from the known point (x_1, y_1) on the line ax + by = 1 and using
the fact that the line has slope -a/b to find new points (x_1 + t, y_1 - (a/b)t). To get new points with
integer coordinates, we need to let t be a multiple of b. Substituting t = kb gives the new integer
solution (x_1 + kb, y_1 - ka).
a = 5 x b + 718          718 = a - 5b

b = 3 x 718 + 193        193 = b - 3 x 718
                             = b - 3 x (a - 5b)
                             = -3a + 16b

718 = 3 x 193 + 139      139 = 718 - 3 x 193
                             = (a - 5b) - 3 x (-3a + 16b)
                             = 10a - 53b

193 = 1 x 139 + 54        54 = 193 - 139
                             = (-3a + 16b) - (10a - 53b)
                             = -13a + 69b

139 = 2 x 54 + 31         31 = 139 - 2 x 54
                             = (10a - 53b) - 2 x (-13a + 69b)
                             = 36a - 191b

54 = 1 x 31 + 23          23 = 54 - 31
                             = -13a + 69b - (36a - 191b)
                             = -49a + 260b

31 = 1 x 23 + 8            8 = 31 - 23
                             = 36a - 191b - (-49a + 260b)
                             = 85a - 451b

23 = 2 x 8 + 7             7 = 23 - 2 x 8
                             = (-49a + 260b) - 2 x (85a - 451b)
                             = -219a + 1162b

8 = 1 x 7 + 1              1 = 8 - 7
                             = 85a - 451b - (-219a + 1162b)
                             = 304a - 1613b

7 = 7 x 1 + 0

So, for example, if we start with the solution (-1, 2) to 5x + 3y = 1, we obtain
new solutions (-1 + 3k, 2 - 5k). Note that the integer k is allowed to be positive,
negative, or zero. Putting in particular values of k gives the solutions

... (-13, 22), (-10, 17), (-7, 12), (-4, 7), (-1, 2),
(2, -3), (5, -8), (8, -13), (11, -18), ....
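
The same list can be generated, and each entry checked, with a few lines of code. This is a small illustrative sketch; the function name solutions and its parameters are our own choices.

```python
def solutions(a, b, x1, y1, ks):
    """Solutions (x1 + k*b, y1 - k*a) of a*x + b*y = 1, one for each k in ks."""
    return [(x1 + k * b, y1 - k * a) for k in ks]

for x, y in solutions(5, 3, -1, 2, range(-4, 5)):
    assert 5 * x + 3 * y == 1    # every pair really is a solution
    print((x, y))
```

Running k from -4 to 4 reproduces the nine solutions to 5x + 3y = 1 listed above.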
Still looking at the case that gcd(a, b) = 1, we can show that this procedure
gives all possible solutions. Suppose that we are given two solutions (x_1, y_1) and
(x_2, y_2) to the equation ax + by = 1. In other words,


a x_1 + b y_1 = 1 and a x_2 + b y_2 = 1.


We are going to multiply the first equation by y_2, multiply the second equation
by y_1, and subtract. This will eliminate b and, after a little bit of algebra, we are
left with

a x_1 y_2 - a x_2 y_1 = y_2 - y_1.

Similarly, if we multiply the first equation by x_2, multiply the second equation
by x_1, and subtract, we find that

b x_2 y_1 - b x_1 y_2 = x_2 - x_1.

So if we let k = x_2 y_1 - x_1 y_2, then we find that

x_2 = x_1 + kb and y_2 = y_1 - ka.


This means that the second solution (x_2, y_2) is obtained from the first solution
(x_1, y_1) by adding a multiple of b onto x_1 and subtracting the same multiple of a
from y_1. So every solution to ax + by = 1 can be obtained from the initial
solution (x_1, y_1) by substituting different values of k into (x_1 + kb, y_1 - ka).

What happens if gcd(a, b) > 1? To make the formulas look a little bit simpler,
we will let g = gcd(a, b). We know from the Euclidean algorithm method that
there is at least one solution (x_1, y_1) to the equation


ax + by = g.


But g divides both a and b, so (x_1, y_1) is a solution to the simpler equation


(a/g) x + (b/g) y = 1.


Now our earlier work applies, since a/g and b/g are relatively prime, so we know
that every other solution can be obtained by substituting values for k in the formula


(x_1 + k · b/g, y_1 - k · a/g).


This completes our description of the solutions to the equation ax + by = g, as 
summarized in the following theorem. 


Theorem 6.1 (Linear Equation Theorem). Let a and b be nonzero integers, and let
g = gcd(a, b). The equation

ax + by = g


always has a solution (x_1, y_1) in integers, and this solution can be found by the
Euclidean algorithm method described earlier. Then every solution to the equation
can be obtained by substituting integers k into the formula


(x_1 + k · b/g, y_1 - k · a/g).
For example, we saw that the equation

60x + 22y = gcd(60, 22) = 2


has the solution x = -4, y = 11. Then our Linear Equation Theorem says that
every solution is obtained from the formula


(-4 + 11k, 11 - 30k) with k any integer.


In particular, if we want a solution with x positive, then we can take k = 1, which
gives the smallest such solution (x, y) = (7, -19).
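
Picking out the solution with the smallest positive x is a matter of reducing x_1 modulo b/g. The sketch below shows one way to do it; the function name smallest_positive_x is our own, and we assume a and b are positive and that (x1, y1) is already known to be a solution.

```python
from math import gcd

def smallest_positive_x(a, b, x1, y1):
    """From one solution (x1, y1) of a*x + b*y = gcd(a, b), with a, b > 0, return
    the solution (x1 + k*b/g, y1 - k*a/g) whose x is the smallest positive value."""
    g = gcd(a, b)
    step_x, step_y = b // g, a // g
    x = (x1 - 1) % step_x + 1        # shift x into the range 1, ..., b/g
    k = (x - x1) // step_x           # the k that produces this shift (exact division)
    return x, y1 - k * step_y

print(smallest_positive_x(60, 22, -4, 11))   # (7, -19), as in the text
```

The expression (x1 - 1) % step_x + 1 relies on Python's convention that % returns a nonnegative result for a positive modulus.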
In this chapter we have shown that the equation 


ax + by = gcd(a, b) 


always has a solution. This fact is extremely important for both theoretical and 
practical reasons, and we will be using it repeatedly in our number theoretic in- 
vestigations. For example, we will need to solve the equation ax + by = 1 when 
we study cryptography in Chapter 18. And in the next chapter we will use this 
equation for our theoretical study of factorization of numbers into primes. 


Exercises 


6.1. (a) Find a solution in integers to the equation


12345x + 67890y = gcd(12345, 67890).


(b) Find a solution in integers to the equation

54321x + 9876y = gcd(54321, 9876).


6.2. Describe all integer solutions to each of the following equations.
(a) 105x + 121y = 1
(b) 12345x + 67890y = gcd(12345, 67890)
(c) 54321x + 9876y = gcd(54321, 9876)


6.3. The method for solving ax + by = gcd(a, b) described in this chapter involves
a considerable amount of manipulation and back substitution. This exercise describes an
alternative way to compute x and y that is especially easy to implement on a computer.

(a) Show that the algorithm described in Figure 6.1 computes the greatest common
divisor g of the positive integers a and b, together with a solution (x, y) in integers to the
equation ax + by = gcd(a, b).

(b) Implement the algorithm on a computer using the computer language of your choice.

(c) Use your program to compute g = gcd(a, b) and integer solutions to ax + by = g for
the following pairs (a, b).
(i) (19789, 23548) (ii) (31875, 8387) (iii) (22241739, 19848039)

(d) What happens to your program if b = 0? Fix the program so that it deals with this 
case correctly. 

(e) For later applications it is useful to have a solution with x > 0. Modify your program 
so that it always returns a solution with x > 0. [Hint. If (x, y) is a solution, then so 
is (x + b, y - a).]


(1) Set x = 1, g = a, v = 0, and w = b.

(2) If w = 0 then set y = (g - ax)/b and return the values (g, x, y).
(3) Divide g by w with remainder, g = qw + t, with 0 ≤ t < w.

(4) Set s = x - qv.

(5) Set (x, g) = (v, w).

(6) Set (v, w) = (s, t).

(7) Go to Step (2).


Figure 6.1: Efficient algorithm to solve ax + by = gcd(a, b) 
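
For readers who would like to compare their own program with something concrete, here is one possible transcription of the steps of Figure 6.1, as we read them, into Python. It is a sketch under stated assumptions: the function name figure_6_1 is our own, a and b are assumed to be positive integers, and the case b = 0 raised in part (d) of the exercise is deliberately left unhandled.

```python
def figure_6_1(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b), for positive integers a, b."""
    x, g, v, w = 1, a, 0, b          # Step (1)
    while w != 0:                    # Step (2) tests whether w = 0
        q, t = divmod(g, w)          # Step (3): g = q*w + t with 0 <= t < w
        s = x - q * v                # Step (4)
        x, g = v, w                  # Step (5)
        v, w = s, t                  # Step (6), then back to Step (2)
    y = (g - a * x) // b             # Step (2): this division is exact
    return g, x, y

print(figure_6_1(60, 22))
```

Only the coefficient x is tracked during the loop; y is recovered once at the end from the equation ax + by = g, which is what makes this version economical.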


6.4. (a) Find integers x, y, and z that satisfy the equation


6x + 15y + 20z = 1.


(b) Under what conditions on a, b, c is it true that the equation

ax + by + cz = 1


has a solution? Describe a general method of finding a solution when one exists.
(c) Use your method from (b) to find a solution in integers to the equation


155x + 341y + 385z = 1.


6.5. Suppose that gcd(a, b) = 1. Prove that for every integer c, the equation ax + by = c
has a solution in integers x and y. [Hint. Find a solution to au + bv = 1 and multiply by c.]
Find a solution to 37x + 47y = 103. Try to make x and y as small as possible.


6.6. Sometimes we are only interested in solutions to ax + by = c using nonnegative
values for x and y.
(a) Explain why the equation 3x + 5y = 4 has no solutions with x ≥ 0 and y ≥ 0.
(b) Make a list of some of the numbers of the form 3x + 5y with x ≥ 0 and y ≥ 0. Make
a conjecture as to which values are not possible. Then prove that your conjecture is
correct.
(c) For each of the following values of (a, b), find the largest number that is not of the
form ax + by with x ≥ 0 and y ≥ 0.


(i) (a, b) = (3, 7) (ii) (a, b) = (5, 7) (iii) (a, b) = (4, 11).


(d) Let gcd(a, b) = 1. Using your results from (c), find a conjectural formula in terms
of a and b for the largest number that is not of the form ax + by with x ≥ 0 and
y ≥ 0. Check your conjecture for at least two more values of (a, b).

(e) Prove that your conjectural formula in (d) is correct.

(f) Try to generalize this problem to sums of three terms ax + by + cz with x ≥ 0,
y ≥ 0, and z ≥ 0. For example, what is the largest number that is not of the form
6x + 10y + 15z with nonnegative x, y, z?


Chapter 7 


Factorization and the 
Fundamental Theorem 
of Arithmetic 


A prime number is a number p ≥ 2 whose only (positive) divisors are 1 and p.
Numbers m ≥ 2 that are not primes are called composite numbers. For example,


prime numbers 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, ...
composite numbers 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, ...


Prime numbers are characterized by the numbers by which they are divisible; that
is, they are defined by the property that they are only divisible by 1 and by
themselves. So it is not immediately clear that prime numbers should have special
properties that involve the numbers that they divide. Thus the following fact
concerning prime numbers is both nonobvious and important.¹


Lemma 7.1. Let p be a prime number, and suppose that p divides the product ab. 
Then either p divides a or p divides b (or p divides both a and b).²


Proof. We are given that p divides the product ab. If p divides a, we are done, so 
we may as well assume that p does not divide a. Now consider what gcd(p, a) can 
be. It divides p, so it is either 1 or p. It also divides a, so it isn’t p, since we have 
assumed that p does not divide a. Thus, gcd(p, a) must equal 1. 


¹A lemma is a result that is used as a stepping stone for proving other results.

²You may say that this lemma is obvious if we look at the prime factorizations of a and b.
However, the fact that a number can be factored into a product of primes in exactly one way is itself a
nonobvious fact. We will discuss this further later in this chapter.
Now we use the Linear Equation Theorem (Chapter 6) with the numbers p
and a. The Linear Equation Theorem says that we can find integers x and y that
solve the equation


px + ay = 1.


[Note that we are using the fact that gcd(p, a) = 1.] Now multiply both sides of
the equation by b. This gives


pbx + aby = b.


Certainly pbx is divisible by p, and also aby is divisible by p, since we know that p
divides ab. It follows that p divides the sum


pbx + aby,


so p divides b. This completes the proof of the lemma.³ □


The lemma says that if a prime divides a product ab, it must divide one of 
the factors. Notice that this is a special property of prime numbers; it is not true 
for composite numbers. For example, 6 divides the product 15 · 14, but 6 divides
neither 15 nor 14. It is not hard to extend the lemma to products with more than 
two factors. 


Theorem 7.2 (Prime Divisibility Property). Let p be a prime number, and suppose
that p divides the product a_1 a_2 ... a_r. Then p divides at least one of the
factors a_1, a_2, ..., a_r.


Proof. If p divides a_1, we’re done. If not, we apply the lemma to the product

a_1 (a_2 a_3 ... a_r)


to conclude that p must divide a_2 a_3 ... a_r. In other words, we are applying the
lemma with a = a_1 and b = a_2 a_3 ... a_r. We know that p | ab, so if p ∤ a, the lemma
says that p must divide b.

So now we know that p divides a_2 a_3 ... a_r. If p divides a_2, we’re done. If
not, we apply the lemma to the product a_2 (a_3 ... a_r) to conclude that p must
divide a_3 ... a_r. Continuing in this fashion, we must eventually find some a_i that is
divisible by p. □


³When we are proving a statement, we use a little box □ to indicate that we have completed the
proof. Some books instead use QED to indicate the end of a proof. The letters QED stand for the
Latin phrase Quod erat demonstrandum, which roughly means “that which was to be proved.” This
in turn comes from the Greek phrase ὅπερ ἔδει δεῖξαι, which appears in Euclid’s Elements.
Later in this chapter we are going to use the Prime Divisibility Property to
prove that every positive integer can be factored as a product of prime numbers
in essentially one way. Unfortunately, this important fact is so familiar to most
readers that they will question why it requires a proof. So before giving the proof,
I want to try to convince you that unique factorization into primes is far from being
obvious. For this purpose, I invite you to leave the familiar behind and enter the⁴


Even Number World
(popularly known as the “E-Zone”)


Imagine yourself in a world where the only numbers that are known are the even 
numbers. So, in this world, the only numbers that exist are 


..., -8, -6, -4, -2, 0, 2, 4, 6, 8, 10, 12, ....


Notice that in the E-Zone we can add, subtract, and multiply numbers just as usual, 
since the sum, difference, and product of even numbers are again even numbers. 
We can also talk about divisibility. We say that a number m E-divides a number n 
if there is a number k with n = mk. But remember that we’re now in the E-Zone, 
so the word “number” means an even number. For example, 6 E-divides 12, since
12 = 6 · 2; but 6 does not E-divide 18, since there is no (even) number k satisfying
18 = 6k.

We can also talk about primes. We say that an (even) number p is an E-prime if 
it is not divisible by any (even) numbers. (In the E-Zone, a number is not divisible 
by itself!) For example, here are some E-primes: 


2, 6, 10, 14, 18, 22, 26, 30. 


Recall the lemma we proved above for ordinary numbers. We showed that if 
a prime p divides a product ab then either p divides a or p divides b. Now move 
to the E-Zone and consider the E-prime 6 and the numbers a = 10 and b = 18. 
The number 6 E-divides ab = 180, since 180 = 6 · 30; but 6 E-divides neither 10
nor 18. So our “obvious” lemma is not true here in the E-Zone! 

There are other “self-evident facts” that are untrue in the E-Zone. For exam- 
ple, consider the fact that every number can be factored as a product of primes in 
exactly one way. (Of course, rearranging the order of the factors is not considered 
a different factorization.) It’s not hard to show, even in the E-Zone, that every 
(even) number can be written as a product of E-primes. But consider the following 
factorizations: 

180 = 6 · 30 = 10 · 18.

Notice that all of the numbers 6, 30, 10, and 18 are E-primes. This means that 180
can be written as a product of E-primes in two fundamentally different ways! In
fact, there is even a third way to write it as a product of E-primes,


180 = 2 · 90.

⁴Since this book is not a multimedia product, you’ll have to use your imagination to supply the
appropriate Twilight Zone music.
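
Life in the E-Zone is easy to simulate. In the sketch below, which is ours and not part of the original discussion, we use the observation that a positive even number is a product of two even numbers exactly when it is divisible by 4, so the E-primes are the even numbers that leave remainder 2 when divided by 4. The function names is_e_prime and e_factorizations are our own.

```python
def is_e_prime(n):
    """A positive even n is an E-prime if it is not a product of two even numbers."""
    return n > 0 and n % 2 == 0 and n % 4 != 0

def e_factorizations(n, smallest=2):
    """All ways to write even n as a product of E-primes, factors in nondecreasing order."""
    results = [[n]] if is_e_prime(n) and n >= smallest else []
    for p in range(smallest, n, 2):
        if p * p > n:
            break
        if n % p == 0 and is_e_prime(p) and (n // p) % 2 == 0:
            results += [[p] + rest for rest in e_factorizations(n // p, p)]
    return results

print([p for p in range(2, 31, 2) if is_e_prime(p)])
print(e_factorizations(180))
```

The first line of output lists the E-primes up to 30, and the second lists the three factorizations 2 · 90, 6 · 30, and 10 · 18 of the number 180.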


We are going to leave the E-Zone now and return to the familiar world where 
odd and even numbers live together in peace and harmony. But we hope that our 
excursion into the E-Zone has convinced you that facts that seem obvious require 
a healthy dose of skepticism. Especially, any “fact” that “must be true” because it 
is very familiar or because it is frequently proclaimed to be true is a fact that needs 
the most careful scrutiny.⁵


E-Zone Border Crossing—Welcome Back Home 


Everyone “knows” that a positive integer can be factored into a product of primes 
in exactly one way. But our visit to the E-Zone provides convincing evidence that 
this obvious assertion requires a careful proof. 


Theorem 7.3 (The Fundamental Theorem of Arithmetic). Every integer n ≥ 2 can
be factored into a product of primes


n = p_1 p_2 ... p_r

in exactly one way.


Before we commence the proof of the Fundamental Theorem of Arithmetic, a 
few comments are in order. First, if n itself is prime, then we just write n = n and 
consider this to be a product consisting of a single number. Second, when we write 
n = p_1 p_2 ... p_r, we do not mean that p_1, p_2, ..., p_r have to be different primes.
For example, we would write 300 = 2 · 2 · 3 · 5 · 5. Third, when we say that n can
be written as a product in exactly one way, we do not consider rearrangement of
the factors to be a new factorization. For example, 12 = 2 · 2 · 3 and 12 = 2 · 3 · 2
and 12 = 3 · 2 · 2, but all these are treated as the same factorization.


Proof. The Fundamental Theorem of Arithmetic really contains two assertions. 
Assertion 1. The number n can be factored into a product of primes in some way. 


Assertion 2. There is only one such factorization (aside from rearranging the fac- 
tors). 


⁵The principle that well-known and frequently asserted “facts” should be carefully scrutinized
also applies to endeavors far removed from mathematics. Politics and journalism come to mind, and 
the reader will undoubtedly be able to add many others to the list. 
We begin with Assertion 1. We are going to give a proof by induction.⁶ Don’t
let this scare you; it just means that first we’ll verify the assertion for n = 2, and
then for n = 3, and then for n = 4, and so on. We begin by observing that 2 = 2
and 3 = 3 and 4 = 2^2, so each of these numbers can be written as a product of
primes. This verifies Assertion 1 for n = 2, 3, 4. Now suppose that we’ve verified
Assertion 1 for every n up to some number, call it N. This means we know that
every number n ≤ N can be factored into a product of primes. Now we’ll check
that the same is true of N + 1.

There are two possibilities. First, N + 1 may already be prime, in which case
it is its own factorization into primes. Second, N + 1 may be composite, which
means that it can be factored as N + 1 = n_1 n_2 with 2 ≤ n_1, n_2 ≤ N. But we
know Assertion 1 is true for n_1 and n_2, since they are both less than or equal to N.
This means that both n_1 and n_2 can be written as a product of primes, say


n_1 = p_1 p_2 ... p_r and n_2 = q_1 q_2 ... q_s.


Multiplying these two products together gives


N + 1 = n_1 n_2 = p_1 p_2 ... p_r q_1 q_2 ... q_s,


so N + 1 can be factored into a product of primes. This means that Assertion 1 is
true for N + 1.

To recapitulate, we have shown that if Assertion 1 is true for all numbers less
than or equal to $N$, then it is also true for $N + 1$. But we have checked it is true
for 2, 3, and 4, so taking $N = 4$, we see that it is also true for 5. But then we can
take $N = 5$ to conclude that it is true for 6. Taking $N = 6$, we see that it is true
for 7, and so on. Since we can continue this process indefinitely, it follows that
Assertion 1 is true for every integer $n \ge 2$.
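
The induction argument for Assertion 1 can be read as a recursive procedure: either a number is prime, or it splits into two smaller pieces that are factored in turn. The following Python sketch is only an illustration of that idea; the function name `factor_by_splitting` is our own and does not come from the text.

```python
from math import isqrt

def factor_by_splitting(n):
    """Return a list of primes whose product is n (for n >= 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # Look for a splitting n = n1 * n2 with 2 <= n1 <= n2 < n.
    for n1 in range(2, isqrt(n) + 1):
        if n % n1 == 0:
            n2 = n // n1
            # Both pieces are smaller than n, so (as in the induction
            # hypothesis) each can be factored; join the two lists.
            return factor_by_splitting(n1) + factor_by_splitting(n2)
    # No splitting exists, so n is prime and is its own factorization.
    return [n]

print(factor_by_splitting(300))  # [2, 2, 3, 5, 5]
```

The recursion stops because each call works with a strictly smaller number, which is the same reason the induction goes through.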

Next we tackle Assertion 2. It is possible to give an induction proof for this
assertion, too, but we will proceed more directly. Suppose that we are able to
factor $n$ as a product of primes in two ways, say

$$n = p_1 p_2 p_3 p_4 \cdots p_r = q_1 q_2 q_3 q_4 \cdots q_s.$$

We need to check that the factorizations are the same, possibly after rearranging
the order of the factors. We first observe that $p_1 \mid n$, so $p_1 \mid q_1 q_2 \cdots q_s$. The Prime
Divisibility Property proved earlier in this chapter tells us that $p_1$ must divide (at
least) one of the $q_i$’s, so if we rearrange the $q_i$’s, we can arrange matters so that
$p_1 \mid q_1$. But $q_1$ is also a prime number, so its only divisors are 1 and $q_1$. Therefore,
we must have $p_1 = q_1$.


⁶We’ll discuss induction more formally in Chapter 26.

Now we cancel $p_1$ (which is the same as $q_1$) from both sides of the equation.
This gives the equation

$$p_2 p_3 p_4 \cdots p_r = q_2 q_3 q_4 \cdots q_s.$$

Briefly repeating the same argument, we note that $p_2$ divides the left-hand side of
this equation, so $p_2$ divides the right-hand side, and hence by the Prime Divisibility
Property, $p_2$ divides one of the $q_i$’s. After rearranging the factors, we get $p_2 \mid q_2$,
and then the fact that $q_2$ is prime means that $p_2 = q_2$. This allows us to cancel $p_2$
(which equals $q_2$) to obtain the new equation

$$p_3 p_4 \cdots p_r = q_3 q_4 \cdots q_s.$$

We can continue in this fashion until either all the $p_i$’s or all the $q_i$’s are gone.
But if all the $p_i$’s are gone, then the left-hand side of the equation equals 1, so there
cannot be any $q_i$’s left, either. Similarly, if the $q_i$’s are all gone, then the $p_i$’s must
all be gone. In other words, the number of $p_i$’s must be the same as the number
of $q_i$’s. To recapitulate, we have shown that if

$$n = p_1 p_2 p_3 p_4 \cdots p_r = q_1 q_2 q_3 q_4 \cdots q_s,$$

where all the $p_i$’s and $q_i$’s are primes, then $r = s$, and we can rearrange the $q_i$’s so
that

$$p_1 = q_1 \quad\text{and}\quad p_2 = q_2 \quad\text{and}\quad p_3 = q_3 \quad\text{and}\quad \ldots \quad\text{and}\quad p_r = q_s.$$

This completes the proof that there is only one way to write $n$ as a product of
primes. □


The Fundamental Theorem of Arithmetic says that every integer $n \ge 2$ can be
written as a product of prime numbers. Suppose we are given a particular integer $n$.
As a practical matter, how can we write it as a product of primes? If $n$ is fairly small
(for example, $n = 180$) we can factor it by inspection,

$$180 = 2 \cdot 90 = 2 \cdot 2 \cdot 45 = 2 \cdot 2 \cdot 3 \cdot 15 = 2 \cdot 2 \cdot 3 \cdot 3 \cdot 5.$$


If $n$ is larger (for example, $n = 9105293$) it may be more difficult to find a
factorization. One method is to try dividing $n$ by primes $2, 3, 5, 7, 11, \ldots$ until we
find a divisor. For $n = 9105293$, we find after some work that the smallest prime
dividing $n$ is 37. We factor out the 37,

$$9105293 = 37 \cdot 246089,$$

and continue checking $37, 41, 43, \ldots$ to find a prime that divides 246089. We find
that $43 \mid 246089$, since $246089 = 43 \cdot 5723$. And so on until we factor $5723 = 59 \cdot 97$,
where we recognize that 59 and 97 are both primes. This gives the complete prime
factorization

$$9105293 = 37 \cdot 43 \cdot 59 \cdot 97.$$


If $n$ is not itself prime, then there must be a prime $p \le \sqrt{n}$ that divides $n$.
To see why this is true, we observe that if $p$ is the smallest prime that divides $n$,
then $n = pm$ with $m \ge p$, and hence $n = pm \ge p^2$. Taking the square root of
both sides yields $\sqrt{n} \ge p$. This gives the following foolproof method for writing
any number $n$ as a product of primes:


    To write $n$ as a product of primes, try dividing it by every number (or
    just every prime number) $2, 3, \ldots$ that is less than or equal to $\sqrt{n}$. If
    you find no numbers that divide $n$, then $n$ itself is prime. Otherwise,
    the first divisor that you find will be a prime $p$. Factor $n = pm$ and
    repeat the process with $m$.
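
The boxed procedure can be sketched in a few lines of Python. This is one possible rendering, not the book's program: the name `trial_division` is our own, and as a small shortcut the search resumes from the divisor just found rather than restarting at 2, which is safe because no smaller divisor can remain.

```python
from math import isqrt

def trial_division(n):
    """Factor n >= 2 into primes by trying divisors d = 2, 3, ... up to sqrt(n)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = []
    d = 2
    while d <= isqrt(n):
        if n % d == 0:
            # The first divisor found is always prime; factor it out
            # and repeat the process with the quotient.
            factors.append(d)
            n //= d
        else:
            d += 1
    # No divisor up to sqrt(n) remains, so what is left is prime.
    factors.append(n)
    return factors

print(trial_division(180))      # [2, 2, 3, 3, 5]
print(trial_division(9105293))  # [37, 43, 59, 97]
```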


This procedure, although fairly inefficient, works fine on a computer for numbers
that are moderately large, say up to 10 digits. But how about a number like
$n = 10^{128} + 1$? If $n$ turns out to be prime, we won’t find out until we’ve checked
$\sqrt{n} \approx 10^{64}$ possible divisors. This is completely infeasible. If we could check
1,000,000,000 (that’s one billion) possible divisors each second, it would still take
approximately $3 \cdot 10^{47}$ years! This leads to the following two closely related
questions:


Question 1. How can we tell if a given number n is prime or composite? 


Question 2. If n is composite, how can we factor it into primes? 


Although it might seem that these questions are the same, it turns out that 
Question 1 is much easier to answer than Question 2. We will later see how to 
write down large numbers that we know are composite, even though we will be 
unable to write down any of their factors. In a similar fashion, we will be able 
to find very large prime numbers p and q such that, if we were to send someone 
the value of the product n = pq, they would be unable to factor n to retrieve the 
numbers p and q. This curious fact, that it is very easy to multiply two numbers but 
very difficult to factor the product, lies at the heart of a remarkable application of 
number theory to the creation of very secure codes. We will describe these codes 
in Chapter 18.


Exercises


7.1. Suppose that gcd(a, b) = 1, and suppose further that a divides the product bc. Show 
that a must divide c. 


7.2. Suppose that gcd(a, b) = 1, and suppose further that a divides c and that b divides c. 
Show that the product ab must divide c. 


7.3. Let $s$ and $t$ be odd integers with $s > t \ge 1$ and $\gcd(s,t) = 1$. Prove that the three
numbers

$$st, \qquad \frac{s^2 - t^2}{2}, \qquad\text{and}\qquad \frac{s^2 + t^2}{2}$$

are pairwise relatively prime; that is, each pair of them is relatively prime. This fact was
needed to complete the proof of the Pythagorean triples theorem (Theorem 2.1 on page 17).
[Hint. Assume that there is a common prime factor and use the fact (Lemma 7.1) that if a
prime divides a product, then it divides one of the factors.]


7.4. Give a proof by induction of each of the following formulas. [Notice that (a) is the
formula that we proved in Chapter 1 using a geometric argument and that (c) is the first $n$
terms of the geometric series.]

(a) $1 + 2 + 3 + \cdots + n = \dfrac{n(n+1)}{2}$

(b) $1^2 + 2^2 + 3^2 + \cdots + n^2 = \dfrac{n(n+1)(2n+1)}{6}$

(c) $1 + a + a^2 + a^3 + \cdots + a^n = \dfrac{1 - a^{n+1}}{1 - a}$ $\quad (a \ne 1)$

(d) [formula not recoverable from the source]


7.5. This exercise asks you to continue the investigation of the E-Zone. Remember as you
work that for the purposes of this exercise, odd numbers do not exist!

(a) Describe all E-primes. 

(b) Show that every even number can be factored as a product of E-primes. [Hint. Mimic 
our proof of this fact for ordinary numbers. ] 

(c) We saw that 180 has three different factorizations as a product of E-primes. Find the 
smallest number that has two different factorizations as a product of E-primes. Is 180 
the smallest number with three factorizations? Find the smallest number with four 
factorizations. 

(d) The number 12 has only one factorization as a product of E-primes: $12 = 2 \cdot 6$. (As
usual, we consider $2 \cdot 6$ and $6 \cdot 2$ to be the same factorization.) Describe all even
numbers that have only one factorization as a product of E-primes.


7.6. Welcome to M-World, where the only numbers that exist are positive integers that
leave a remainder of 1 when divided by 4. In other words, the only M-numbers that exist
are

$$1, 5, 9, 13, 17, 21, \ldots.$$

(Another description is that these are the numbers of the form $4t + 1$ for $t = 0, 1, 2, \ldots$.)
In the M-World, we cannot add numbers, but we can multiply them, since if $a$ and $b$ both
leave a remainder of 1 when divided by 4 then so does their product. (Do you see why this
is true?)

We say that m M-divides n if n = mk for some M-number k. And we say that n is 
an M-prime if its only M-divisors are 1 and itself. (Of course, we don’t consider 1 itself to 
be an M-prime.) 

(a) Find the first six M-primes. 
(b) Find an M-number n that has two different factorizations as a product of M-primes. 


7.7. In this exercise you are asked to write programs to factor a (positive) integer $n$
into a product of primes. (If $n = 0$, be sure to return an error message instead of going into
an infinite loop!) A convenient way to represent the factorization of $n$ is as a $2 \times r$ matrix.
Thus, if

$$n = p_1^{k_1} p_2^{k_2} \cdots p_r^{k_r},$$

then store the factorization of $n$ as the matrix

$$\begin{pmatrix} p_1 & p_2 & \cdots & p_r \\ k_1 & k_2 & \cdots & k_r \end{pmatrix}.$$

(If your programming language doesn’t allow dynamic storage allocation, you’ll have to
decide ahead of time how many factors to allow.)

(a) Write a program to factor n by trying each possible factor d = 2,3, 4,5,6,.... (This 
is an extremely inefficient method but will serve as a warm-up exercise.) 

(b) Modify your program by storing the values of the first 100 (or more) primes and 
first removing these primes from n before looking for larger prime factors. You 
can speed up your program when trying larger d’s as potential factors if you don’t 
bother checking d’s that are even, or divisible by 3, or by 5. You can also increase 
efficiency by using the fact that a number m is prime if it is not divisible by any 
number between 2 and $\sqrt{m}$. Use your program to find the complete factorization of
all numbers between 1,000,000 and 1,000,030. 

(c) Write a subroutine that prints the factorization of $n$ in a nice format. Optimally,
the exponents should appear as exponents; but if this is not possible, then print the
factorization of (say) $n = 75460 = 2^2 \cdot 5 \cdot 7^3 \cdot 11$ as

    2^2 * 5 * 7^3 * 11

(To make the output easier to read, don’t print exponents that equal 1.)


Chapter 8 


Congruences 


Divisibility is a powerful tool in the theory of numbers. We have seen this amply 
demonstrated in our work on Pythagorean triples, greatest common divisors, and 
factorization into primes. In this chapter we will discuss the theory of congruences. 
Congruences provide a convenient way to describe divisibility properties. In fact, 
they are so convenient and natural that they make the theory of divisibility very 
similar to the theory of equations. 

We say that $a$ is congruent to $b$ modulo $m$, and we write

$$a \equiv b \pmod{m},$$

if $m$ divides $a - b$. For example,

$$7 \equiv 2 \pmod{5} \quad\text{and}\quad 47 \equiv 35 \pmod{6},$$

since

$$5 \mid (7 - 2) \quad\text{and}\quad 6 \mid (47 - 35).$$

In particular, if $a$ divided by $m$ leaves a remainder of $r$, then $a$ is congruent to $r$
modulo $m$. Notice that the remainder satisfies $0 \le r < m$, so every integer is
congruent, modulo $m$, to a number between 0 and $m - 1$.
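
The definition translates directly into a one-line test. The sketch below uses a helper name of our own, `is_congruent`, and relies on the fact that Python's `%` operator returns a remainder between 0 and $m - 1$ whenever $m$ is positive, even for negative inputs.

```python
def is_congruent(a, b, m):
    """True when m divides a - b, i.e. a is congruent to b modulo m."""
    return (a - b) % m == 0

print(is_congruent(7, 2, 5))    # True, since 5 divides 7 - 2
print(is_congruent(47, 35, 6))  # True, since 6 divides 47 - 35
print(is_congruent(15, 20, 10)) # False
print(-7 % 8)                   # 1, the remainder lies between 0 and 7
```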

The number $m$ is called the modulus of the congruence. Congruences with the
same modulus behave in many ways like ordinary equations. Thus, if

$$a_1 \equiv b_1 \pmod{m} \quad\text{and}\quad a_2 \equiv b_2 \pmod{m}, \quad\text{then}$$

$$a_1 \pm a_2 \equiv b_1 \pm b_2 \pmod{m} \quad\text{and}\quad a_1 a_2 \equiv b_1 b_2 \pmod{m}.$$


    Warning. It is not always possible to divide congruences. In other words,
    if $ac \equiv bc \pmod{m}$, it need not be true that $a \equiv b \pmod{m}$. For example,
    $15 \cdot 2 \equiv 20 \cdot 2 \pmod{10}$, but $15 \not\equiv 20 \pmod{10}$. Even more distressing, it
    is possible to have

    $$uv \equiv 0 \pmod{m} \quad\text{with}\quad u \not\equiv 0 \pmod{m} \quad\text{and}\quad v \not\equiv 0 \pmod{m}.$$

    Thus $6 \cdot 4 \equiv 0 \pmod{12}$, but $6 \not\equiv 0 \pmod{12}$ and $4 \not\equiv 0 \pmod{12}$.
    However, if $\gcd(c, m) = 1$, then it is okay to cancel $c$ from the congruence
    $ac \equiv bc \pmod{m}$. You will be asked to verify this as an exercise.


Congruences with unknowns can be solved in the same way that equations are
solved. For example, to solve the congruence

$$x + 12 \equiv 5 \pmod{8},$$

we subtract 12 from each side to get

$$x \equiv 5 - 12 \equiv -7 \pmod{8}.$$

This solution is fine, or we can use the equivalent solution $x \equiv 1 \pmod{8}$. Notice
that $-7$ and 1 are the same modulo 8, since their difference is divisible by 8.
Here’s another example. To solve

$$4x \equiv 3 \pmod{19},$$

we will multiply both sides by 5. This gives

$$20x \equiv 15 \pmod{19}.$$

But $20 \equiv 1 \pmod{19}$, so $20x \equiv x \pmod{19}$. Thus the solution is

$$x \equiv 15 \pmod{19}.$$

We can check our answer by substituting 15 into the original congruence. Is
$4 \cdot 15 \equiv 3 \pmod{19}$? Yes, because $4 \cdot 15 - 3 = 57 = 3 \cdot 19$ is divisible by 19.
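
The multiplier 5 works because $4 \cdot 5 \equiv 1 \pmod{19}$. As a hedged illustration, Python (version 3.8 or later, an assumption about your interpreter) can find such a multiplier with the built-in three-argument `pow` and a negative exponent; the variable names below are ours.

```python
m = 19
multiplier = pow(4, -1, m)   # the number u with 4*u congruent to 1 mod 19
x = (3 * multiplier) % m     # multiply both sides of 4x = 3 by u

print(multiplier)            # 5
print(x)                     # 15
print((4 * x - 3) % m == 0)  # True: 19 divides 4*15 - 3 = 57
```

This shortcut is available only when the coefficient and the modulus are relatively prime; otherwise `pow` raises a `ValueError`.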

We solved this last congruence by a trick, but if all else fails, there’s always
the “climb every mountain” technique.¹ To solve a congruence modulo $m$, we can
just try each value $0, 1, \ldots, m - 1$ for each variable. For example, to solve the
congruence

$$x^2 + 2x - 1 \equiv 0 \pmod{7},$$

we just try $x = 0, x = 1, \ldots, x = 6$. This leads to the two solutions $x \equiv 2 \pmod{7}$
and $x \equiv 3 \pmod{7}$. Of course, there are other solutions, such as $x \equiv 9 \pmod{7}$.
But 9 and 2 are not really different solutions, since they are the same modulo 7.
So when we speak of “finding all the solutions to a congruence,” we normally
mean that we will find all incongruent solutions, that is, all solutions that are not
congruent to one another.

¹Also known as the “ford every stream” technique for those who prefer wet feet to vertigo.

We also observe that there are many congruences, such as $x^2 \equiv 3 \pmod{10}$,
that have no solutions. This shouldn’t be too surprising. After all, there are ordinary
equations such as $x^2 = -1$ that have no (real) solutions.
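
The “climb every mountain” search is easy to automate. The following is a minimal sketch with a function name of our own choosing, `solve_by_search`; it simply tries every residue from 0 to $m - 1$.

```python
def solve_by_search(f, m):
    """Return all x in 0..m-1 for which f(x) is divisible by m."""
    return [x for x in range(m) if f(x) % m == 0]

# x^2 + 2x - 1 = 0 (mod 7) has the two incongruent solutions 2 and 3.
print(solve_by_search(lambda x: x**2 + 2*x - 1, 7))  # [2, 3]

# x^2 = 3 (mod 10) has no solutions at all.
print(solve_by_search(lambda x: x**2 - 3, 10))       # []
```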

Our final task in this chapter is to solve congruences that look like

$$ax \equiv c \pmod{m}.$$

Some congruences of this type have no solutions. For example, if

$$6x \equiv 15 \pmod{514}$$

were to have a solution, then 514 would have to divide $6x - 15$. But $6x - 15$ is
always odd, so it cannot be divisible by the even number 514. Hence the congruence
$6x \equiv 15 \pmod{514}$ has no solutions.

Before giving the general theory, let’s try an example. We will solve the
congruence

$$18x \equiv 8 \pmod{22}.$$

This means we need to find a value of $x$ with 22 dividing $18x - 8$, so we have to
find a value of $x$ with $18x - 8 = 22y$ for some $y$. In other words, we need to solve
the linear equation

$$18x - 22y = 8.$$

We know from Chapter 6 that we can solve the equation

$$18u - 22v = \gcd(18, 22) = 2,$$

and indeed we easily find the solution $u = 5$ and $v = 4$. But we really want the
right-hand side to equal 8, so we multiply by 4 to get

$$18 \cdot (5 \cdot 4) - 22 \cdot (4 \cdot 4) = 8.$$

Thus, $18 \cdot 20 \equiv 8 \pmod{22}$, so $x \equiv 20 \pmod{22}$ is a solution to the original
congruence. We will soon see that this congruence has two different solutions
modulo 22; the other one turns out to be $x \equiv 9 \pmod{22}$.

Suppose now that we are asked to solve an arbitrary congruence of the form

$$ax \equiv c \pmod{m}.$$

We need to find an integer $x$ such that $m$ divides $ax - c$. The number $m$ will divide
the number $ax - c$ if we can find an integer $y$ such that $ax - c = my$. Rearranging
this last equation slightly, we see that $ax \equiv c \pmod{m}$ has a solution if, and only
if, the linear equation $ax - my = c$ has a solution. This should look familiar; it is
precisely the sort of problem we solved in Chapter 6.

To make our formulas a bit neater, we will let $g = \gcd(a, m)$. Our first
observation is that every number of the form $ax - my$ is a multiple of $g$; so if $g$ does not
divide $c$, then $ax - my = c$ has no solutions and so $ax \equiv c \pmod{m}$ also has no
solutions.

Next suppose that $g$ does divide $c$. We know from the Linear Equation Theorem
in Chapter 6 that there is always a solution to the equation

$$au + mv = g.$$

Suppose we find a solution $u = u_0$, $v = v_0$, either by trial and error or by using the
Euclidean algorithm method described in Chapter 6. Since we are assuming that $g$
divides $c$, we can multiply this equation by the integer $c/g$ to obtain the equation

$$a \frac{c u_0}{g} + m \frac{c v_0}{g} = c.$$

This means that

$$x_0 \equiv \frac{c u_0}{g} \pmod{m} \quad\text{is a solution to the congruence}\quad ax \equiv c \pmod{m}.$$

Are there other solutions? Suppose that $x_1$ is some other solution to the
congruence $ax \equiv c \pmod{m}$. Then $a x_1 \equiv a x_0 \pmod{m}$, so $m$ divides $a x_1 - a x_0$.
This implies that

$$\frac{m}{g} \quad\text{divides}\quad \frac{a(x_1 - x_0)}{g},$$

and we know that $m/g$ and $a/g$ have no common factors, so $m/g$ must divide
$x_1 - x_0$. In other words, there is some number $k$ such that

$$x_1 = x_0 + k \cdot \frac{m}{g}.$$

But any two solutions that differ by a multiple of $m$ are considered to be the
same, so there will be exactly $g$ different solutions that are obtained by taking
$k = 0, 1, \ldots, g - 1$.

This completes our analysis of the congruence $ax \equiv c \pmod{m}$. We
summarize our findings in the following statement.

Theorem 8.1 (Linear Congruence Theorem). Let $a$, $c$, and $m$ be integers with
$m \ge 1$, and let $g = \gcd(a, m)$.
(a) If $g \nmid c$, then the congruence $ax \equiv c \pmod{m}$ has no solutions.
(b) If $g \mid c$, then the congruence $ax \equiv c \pmod{m}$ has exactly $g$ incongruent
solutions. To find the solutions, first find a solution $(u_0, v_0)$ to the linear equation

$$au + mv = g.$$

(A method for solving this equation is described in Chapter 6.) Then $x_0 = c u_0 / g$
is a solution to $ax \equiv c \pmod{m}$, and a complete set of incongruent
solutions is given by

$$x \equiv x_0 + k \cdot \frac{m}{g} \pmod{m} \quad\text{for } k = 0, 1, 2, \ldots, g - 1.$$
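
The recipe in Theorem 8.1 can be sketched as a short program. This is one possible implementation under our own naming (`extended_gcd`, `solve_linear_congruence`); it uses the standard extended Euclidean algorithm to find $u_0$ and returns an empty list when $g$ does not divide $c$.

```python
def extended_gcd(a, b):
    """Return (g, u, v) with a*u + b*v == g == gcd(a, b), for a, b >= 0."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_r, old_u, old_v

def solve_linear_congruence(a, c, m):
    """Return all incongruent solutions x (0 <= x < m) of a*x = c (mod m)."""
    g, u0, _ = extended_gcd(a % m, m)
    if c % g != 0:
        return []                      # case (a): no solutions
    x0 = (c // g) * u0 % m             # one solution, x0 = c*u0/g
    step = m // g
    return sorted((x0 + k * step) % m for k in range(g))  # case (b)

print(solve_linear_congruence(18, 8, 22))     # [9, 20]
print(solve_linear_congruence(6, 15, 514))    # []
```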


For example, the congruence

$$943x \equiv 381 \pmod{2576}$$

has no solutions, since $\gcd(943, 2576) = 23$ does not divide 381. On the other
hand, the congruence

$$893x \equiv 266 \pmod{2432}$$

has 19 solutions, since $\gcd(893, 2432) = 19$ does divide 266. Notice that we are
able to determine the number of solutions without having computed any of them.
To actually find the solutions, we first solve

$$893u - 2432v = 19.$$

Using the methods from Chapter 6, we find the solution $(u, v) = (79, 29)$.
Multiplying by $266/19 = 14$ gives the solution

$$(x, y) = (1106, 406) \quad\text{to the equation}\quad 893x - 2432y = 266.$$

Finally, the complete set of solutions to

$$893x \equiv 266 \pmod{2432}$$

is obtained by starting with $x \equiv 1106 \pmod{2432}$ and adding multiples of the
quantity $2432/19 = 128$. (Don’t forget that if the numbers go above 2432 we are
allowed to subtract 2432.) The 19 incongruent solutions are

    1106, 1234, 1362, 1490, 1618, 1746, 1874, 2002, 2130, 2258,
    2386, 82, 210, 338, 466, 594, 722, 850, 978.

Important Note. The most important case of the Linear Congruence Theorem is
when $\gcd(a, m) = 1$. In this case, it says that the congruence

$$ax \equiv c \pmod{m} \qquad (*)$$

has exactly one solution. We might even write the solution as a fraction

$$x \equiv \frac{c}{a} \pmod{m},$$

but if we do, then we must remember that the symbol “$\frac{c}{a} \pmod{m}$” is really only
a convenient shorthand for the solution to the congruence $(*)$.


Nonlinear congruences are also very important in number theory. As an
example, consider the congruence

$$x^2 + 1 \equiv 0 \pmod{m}$$

whose solutions are square roots of $-1$ modulo $m$. For some values of $m$ such as
$m = 5$ and $m = 13$, there are solutions,

$$2^2 + 1 \equiv 0 \pmod{5} \quad\text{and}\quad 5^2 + 1 \equiv 0 \pmod{13},$$

while for other values such as $m = 3$ and $m = 7$, there are no solutions.

You probably already know that a polynomial of degree $d$ with real coefficients
has no more than $d$ real roots.² This well-known “fact” is not true for congruences,
since for example the congruence

$$x^2 + x \equiv 0 \pmod{6}$$

has four distinct roots modulo 6, namely 0, 2, 3, and 5. However, if we look at
congruences modulo primes, then order and harmony are restored to the world.
And although the statement of the following theorem may seem innocuous, we
will see later that it is a powerful tool for proving many important results.


Theorem 8.2 (Polynomial Roots Mod p Theorem). Let $p$ be a prime number and
let

$$f(x) = a_0 x^d + a_1 x^{d-1} + \cdots + a_d$$

be a polynomial of degree $d \ge 1$ with integer coefficients and with $p \nmid a_0$. Then the
congruence

$$f(x) \equiv 0 \pmod{p}$$

has at most $d$ incongruent solutions.
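
A quick experiment makes the contrast between composite and prime moduli concrete. The sketch below is illustrative only; the helper `roots_mod` is our own name, and it lists coefficients from the highest degree down, matching $a_0, a_1, \ldots, a_d$ in the theorem.

```python
def roots_mod(coeffs, m):
    """Roots in 0..m-1 of the polynomial with coefficients
    [a_0, a_1, ..., a_d] (highest degree first), taken modulo m."""
    def value(x):
        result = 0
        for a in coeffs:          # Horner's rule, reducing as we go
            result = (result * x + a) % m
        return result
    return [x for x in range(m) if value(x) == 0]

print(roots_mod([1, 1, 0], 6))   # x^2 + x mod 6  -> [0, 2, 3, 5]
print(roots_mod([1, 1, 0], 7))   # x^2 + x mod 7  -> [0, 6]
print(roots_mod([1, 0, 1], 5))   # x^2 + 1 mod 5  -> [2, 3]
print(roots_mod([1, 0, 1], 13))  # x^2 + 1 mod 13 -> [5, 8]
print(roots_mod([1, 0, 1], 7))   # x^2 + 1 mod 7  -> []
```

Modulo 6 the quadratic has four roots, while modulo the primes 5, 7, and 13 the number of roots never exceeds the degree.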


²In fact, the Fundamental Theorem of Algebra (see Theorem 35.1 on page 268) implies that a
polynomial of degree $d$ with complex coefficients always has exactly $d$ complex roots, provided that
you count multiple roots appropriately.

There are many ways to prove this important theorem, but for the sake of
variety and to introduce you to a new mathematical tool, we give a “Proof by
Contradiction.”³ In a proof by contradiction, we begin by making a statement. We then
use that statement to make deductions, eventually ending up with a conclusion that
is clearly false. This allows us to deduce that the original statement was false, since
it led to a false conclusion.⁴

The particular statement with which we begin is the following:


    Statement: There exists at least one polynomial $F(x)$ with integer
    coefficients and with leading coefficient not divisible by
    $p$ such that the congruence $F(x) \equiv 0 \pmod{p}$ has more
    distinct roots modulo $p$ than its degree.


Now among all such polynomials, we choose one having smallest possible degree,
say

$$F(x) = A_0 x^d + A_1 x^{d-1} + A_2 x^{d-2} + \cdots + A_d.$$

Then we let

$$r_1, r_2, \ldots, r_{d+1}$$

be distinct mod $p$ solutions to the congruence

$$F(x) \equiv 0 \pmod{p}.$$

We are going to use the fact that for any value of $r$, the difference $F(x) - F(r)$
can be factored. To see this, we write

$$F(x) - F(r) = A_0 (x^d - r^d) + A_1 (x^{d-1} - r^{d-1}) + \cdots + A_{d-1} (x - r).$$

Each term $x^i - r^i$ has a factor of $x - r$, since

$$x^i - r^i = (x - r)(x^{i-1} + x^{i-2} r + x^{i-3} r^2 + \cdots + x r^{i-2} + r^{i-1}).$$

Pulling an $x - r$ out of each term, we find that

$$F(x) - F(r) = (x - r)(\text{some messy polynomial of degree } d - 1).$$
In other words, there is a polynomial

$$G(x) = B_0 x^{d-1} + B_1 x^{d-2} + \cdots + B_{d-2} x + B_{d-1}$$

of degree $d - 1$ such that

$$F(x) = F(r) + (x - r) G(x).$$

³The classical Latin phrase for “proof by contradiction” is reductio ad absurdum, literally
“reduction to an absurdity.” As G.H. Hardy says in his monograph A Mathematician’s Apology, proof
by contradiction “is one of a mathematician’s finest weapons. It is a far finer gambit than any chess
gambit: a chess player may offer the sacrifice of a pawn or even a piece, but a mathematician offers
the game.”
⁴See page 299 for a brief discussion of the philosophy that lies behind proofs by contradiction.

In particular, if we substitute $r = r_1$ and use the fact that $F(r_1) \equiv 0 \pmod{p}$, we
find that

$$F(x) \equiv (x - r_1) G(x) \pmod{p}.$$


We have assumed that $F(x) \equiv 0 \pmod{p}$ has $d + 1$ distinct incongruent
solutions $x = r_1, r_2, \ldots, r_{d+1}$. If we substitute one of the solutions $r_k$ with $k \ge 2$
for $x$, we find that

$$0 \equiv F(r_k) \equiv (r_k - r_1) G(r_k) \pmod{p}.$$

We know that $r_1 \not\equiv r_k \pmod{p}$, so the Prime Divisibility Property (Theorem 7.2)
tells us that $G(r_k) \equiv 0 \pmod{p}$. (Note that this is where we use the assumption
that the modulus $p$ is prime. Do you see why the argument would fall apart if the
modulus were composite?)

We now know that $r_2, r_3, \ldots, r_{d+1}$ are solutions to $G(x) \equiv 0 \pmod{p}$. Thus
$G(x)$ is a polynomial of degree $d - 1$ that has $d$ distinct roots modulo $p$. This
contradicts the fact that among such polynomials, the polynomial $F(x)$ was one
having the smallest possible degree. Hence the original statement must be false,
which shows that there are no polynomials having more roots modulo $p$ than their
degree. Stated in a positive manner, we have proven that every polynomial of
degree $d$ has at most $d$ roots modulo $p$. This completes the proof of Theorem 8.2.
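
The key step of the proof, writing $F(x) = F(r) + (x - r)G(x)$, can be carried out mechanically by synthetic division. The sketch below is an illustration under our own naming (`divide_out_root`); coefficients are listed from the highest degree down, as in $A_0, A_1, \ldots, A_d$.

```python
def divide_out_root(coeffs, r):
    """Given F as [A_0, ..., A_d] (highest degree first) and an integer r,
    return (G, F(r)) where F(x) = F(r) + (x - r) * G(x) and
    G is listed as [B_0, ..., B_{d-1}]."""
    partial = []
    acc = 0
    for a in coeffs:
        acc = acc * r + a      # Horner's rule; the partial sums are the B_i
        partial.append(acc)
    return partial[:-1], partial[-1]

# F(x) = x^2 + 2x - 1 and r = 2: F(2) = 7 and G(x) = x + 4.
G, value = divide_out_root([1, 2, -1], 2)
print(G, value)                # [1, 4] 7
```

Since $F(2) = 7 \equiv 0 \pmod{7}$, this gives $F(x) \equiv (x - 2)(x + 4) \pmod{7}$, and the second factor vanishes at $x \equiv 3 \pmod{7}$, the other root found earlier by direct search.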


Exercises 


8.1. Suppose that $a_1 \equiv b_1 \pmod{m}$ and $a_2 \equiv b_2 \pmod{m}$.
(a) Verify that $a_1 + a_2 \equiv b_1 + b_2 \pmod{m}$ and that $a_1 - a_2 \equiv b_1 - b_2 \pmod{m}$.
(b) Verify that $a_1 a_2 \equiv b_1 b_2 \pmod{m}$.


8.2. Suppose that

$$ac \equiv bc \pmod{m}$$

and also assume that $\gcd(c, m) = 1$. Prove that $a \equiv b \pmod{m}$.


8.3. Find all incongruent solutions to each of the following congruences.
(a) $7x \equiv 3 \pmod{15}$          (b) $6x \equiv 5 \pmod{15}$
(c) $x^2 \equiv 1 \pmod{8}$          (d) $x^2 \equiv 2 \pmod{7}$
(e) $x^2 \equiv 3 \pmod{7}$


8.4. Prove that the following divisibility tests work. 
(a) The number $a$ is divisible by 4 if and only if its last two digits are divisible by 4.
(b) The number $a$ is divisible by 8 if and only if its last three digits are divisible by 8.
(c) The number $a$ is divisible by 3 if and only if the sum of its digits is divisible by 3.
(d) The number $a$ is divisible by 9 if and only if the sum of its digits is divisible by 9.
(e) The number $a$ is divisible by 11 if and only if the alternating sum of the digits of $a$ is
divisible by 11. (If the digits of $a$ are $a_1 a_2 a_3 \ldots a_{d-1} a_d$, the alternating sum means
to take $a_1 - a_2 + a_3 - \cdots$ with alternating plus and minus signs.)
[Hint. For (a), reduce modulo 100, and similarly for (b). For (c), (d), and (e), write $a$ as a
sum of multiples of powers of 10 and reduce modulo 3, 9, and 11.]


8.5. Find all incongruent solutions to each of the following linear congruences.
(a) $8x \equiv 6 \pmod{14}$
(b) $66x \equiv 100 \pmod{121}$
(c) $21x \equiv 14 \pmod{91}$


8.6. Determine the number of incongruent solutions for each of the following congruences.
You need not write down the actual solutions.

(a) $72x \equiv 47 \pmod{200}$

(b) $4183x \equiv 5781 \pmod{15087}$

(c) $1537x \equiv 2863 \pmod{6731}$


8.7. Write a program that solves the congruence

$$ax \equiv c \pmod{m}.$$

[If $\gcd(a, m)$ does not divide $c$, return an error message and the value of $\gcd(a, m)$.] Test
your program by finding all of the solutions to the congruences in Exercise 8.6.


8.8. Write a program that takes as input a positive integer $m$ and a polynomial $f(X)$
having integer coefficients and produces as output all of the solutions to the congruence

$$f(X) \equiv 0 \pmod{m}.$$

(Don’t try to be fancy. Just substitute $X = 0, 1, 2, \ldots, m - 1$ and see which values are
solutions.) Test your program by taking the polynomial

$$f(X) = \text{[polynomial not recoverable from the source]}$$

and solving the congruence $f(X) \equiv 0 \pmod{m}$ for each of the following values of $m$,

$$m \in \{130, 137, 144, 151, 158, 165, 172\}.$$

8.9. (a) How many solutions are there to the congruence

$$X^4 + 5X^3 + 4X^2 - 6X - 4 \equiv 0 \pmod{11} \quad\text{with } 0 \le X < 11?$$

Are there four solutions, or are there fewer than four solutions?
(b) Consider the congruence $X^2 - 1 \equiv 0 \pmod{8}$. How many solutions does it have
with $0 \le X < 8$? Notice that there are more than two solutions. Why doesn’t this
contradict the Polynomial Roots Mod p Theorem (Theorem 8.2)?


8.10. Let $p$ and $q$ be distinct primes. What is the maximum number of possible solutions
to a congruence of the form

$$x^2 - a \equiv 0 \pmod{pq},$$

where as usual we are only interested in solutions that are distinct modulo $pq$?


Chapter 9 


Congruences, Powers, 
and Fermat’s Little Theorem 


Take a number $a$ and consider its powers $a, a^2, a^3, \ldots$ modulo $m$. Is there any
pattern to these powers? We will start by looking at a prime modulus m = p, 
since the pattern is easier to spot. This is a common situation in the theory of 
numbers, especially when working with congruences. So whenever you’re faced 
with discovering a congruence pattern, it’s usually a good idea to begin with a 
prime modulus. 

For each of the primes p = 3, p = 5, and p = 7, we have listed integers 
a = 0,1,2,... and some of their powers modulo p. Before reading further, you 
should stop, examine these tables, and try to formulate some conjectural patterns. 
Then test your conjectures by creating a similar table for $p = 11$ and seeing if your
patterns are still true.

[Tables: powers $a^k$ modulo 3, modulo 5, and modulo 7, with one row for each $a = 0, 1, 2, \ldots$]

Many interesting patterns are visible in these tables. The one that we will be
concerned with in this chapter can be seen in the columns

$$a^2 \pmod{3}, \qquad a^4 \pmod{5}, \qquad\text{and}\qquad a^6 \pmod{7}.$$

Every entry in these columns, aside from the top one, is equal to 1. Does this
pattern continue to hold for larger primes? You can check the table you made for
$p = 11$, and you will find that

$$1^{10} \equiv 1 \pmod{11}, \quad 2^{10} \equiv 1 \pmod{11}, \quad 3^{10} \equiv 1 \pmod{11}, \quad \ldots,$$
$$9^{10} \equiv 1 \pmod{11}, \quad\text{and}\quad 10^{10} \equiv 1 \pmod{11}.$$
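
Tables of powers like these are easy to regenerate by machine. The following Python sketch is only an aid to experimentation; the function name `power_table` and the choice of printing exponents 1 through $p - 1$ are our own assumptions, not the layout of the original tables.

```python
def power_table(p, max_exponent):
    """Row a (for a = 0..p-1) lists a^1, a^2, ..., a^max_exponent modulo p."""
    return [[pow(a, k, p) for k in range(1, max_exponent + 1)]
            for a in range(p)]

for p in (3, 5, 7, 11):
    print(f"powers modulo {p}")
    for a, row in enumerate(power_table(p, p - 1)):
        print(a, row)
    # The last column is a^(p-1) mod p: 0 for a = 0 and 1 otherwise.
    print([row[-1] for row in power_table(p, p - 1)])
```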


This leads us to make the following conjecture:

$$a^{p-1} \equiv 1 \pmod{p} \quad\text{for every integer } 1 \le a < p.$$

Of course, we don’t really need to restrict $a$ to be between 1 and $p - 1$. If $a_1$
and $a_2$ differ by a multiple of $p$, then their powers will be the same modulo $p$. So
the real condition on $a$ is that it not be a multiple of $p$. This result was first stated
by Pierre de Fermat in a letter to Frénicle de Bessy dated 1640, but Fermat gave
no indication of his proof. The first known proof appears to be due to Gottfried
Leibniz.¹


Theorem 9.1 (Fermat’s Little Theorem). Let $p$ be a prime number, and let $a$ be
any number with $a \not\equiv 0 \pmod{p}$. Then

$$a^{p-1} \equiv 1 \pmod{p}.$$

Before giving the proof of Fermat’s Little Theorem, we want to indicate its
power and show how it can be used to simplify computations. As a particular
example, consider the congruence

$$6^{22} \equiv 1 \pmod{23}.$$

This says that the number $6^{22} - 1$ is a multiple of 23. If we wanted to check this
fact without using Fermat’s Little Theorem, we would have to multiply out $6^{22}$,
subtract 1, and divide by 23. Here’s what we get:

$$6^{22} - 1 = 23 \cdot 5722682775750745.$$

¹Gottfried Leibniz (1646-1716) is best known as one of the discoverers of the calculus. He and
Isaac Newton worked out the main theorems of the calculus independently and at about the same
time. The German and English mathematical communities spent the next two centuries arguing over
who deserved priority. The current consensus is that both Leibniz and Newton should be given joint
credit as the (independent) discoverers of the calculus.

Similarly, in order to verify directly that $73^{100} \equiv 1 \pmod{101}$, we would have to
compute $73^{100} - 1$. Unfortunately, $73^{100} - 1$ has 187 digits! And notice that this
example only uses $p = 101$, which is a comparatively small prime. Fermat’s Little
Theorem thus describes a very surprising fact about extremely large numbers.

We can use Fermat’s Little Theorem to simplify computations. For example,
in order to compute $2^{35} \pmod{7}$, we can use the fact that $2^6 \equiv 1 \pmod{7}$. So we
write $35 = 6 \cdot 5 + 5$ and use the law of exponents to compute

$$2^{35} = 2^{6 \cdot 5 + 5} = (2^6)^5 \cdot 2^5 \equiv 1^5 \cdot 2^5 \equiv 32 \equiv 4 \pmod{7}.$$


Similarly, suppose that we want to solve the congruence $x^{103} \equiv 4 \pmod{11}$.
Certainly, $x \not\equiv 0 \pmod{11}$, so Fermat’s Little Theorem tells us that

$$x^{10} \equiv 1 \pmod{11}.$$

Raising both sides to the 10th power gives $x^{100} \equiv 1 \pmod{11}$, and then
multiplying by $x^3$ gives $x^{103} \equiv x^3 \pmod{11}$. So, to solve the original congruence, we just
need to solve $x^3 \equiv 4 \pmod{11}$. This can be solved by trying successively $x = 1$,
$x = 2$, ....

x (mod 11)      0   1   2   3   4   5   6   7   8   9   10
x^3 (mod 11)    0   1   8   5   9   4   7   2   6   3   10

So the congruence $x^{103} \equiv 4 \pmod{11}$ has the solution $x \equiv 5 \pmod{11}$.
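
These computations can be double-checked with Python's built-in three-argument `pow`, which reduces modulo $p$ at every step and so never builds the 187-digit number. The helper name `solve_power_congruence` is our own, introduced only for this sketch.

```python
def solve_power_congruence(exponent, target, p):
    """All x in 0..p-1 with x^exponent congruent to target modulo p."""
    return [x for x in range(p) if pow(x, exponent, p) == target % p]

print(pow(6, 22, 23))                     # 1
print((6**22 - 1) // 23)                  # 5722682775750745
print(pow(73, 100, 101))                  # 1
print(len(str(73**100 - 1)))              # 187 digits
print(pow(2, 35, 7), pow(2, 5, 7))        # 4 4

# Reducing the exponent 103 to 3 does not change the solutions mod 11.
print(solve_power_congruence(103, 4, 11)) # [5]
print(solve_power_congruence(3, 4, 11))   # [5]
```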


We are now ready to prove Fermat’s Little Theorem. In order to illustrate the
method of proof, we will first prove that $3^6 \equiv 1 \pmod{7}$. Of course, there is no
need to give a fancy proof of this fact, since $3^6 - 1 = 728 = 7 \cdot 104$. Nevertheless,
when attempting to understand a proof or when attempting to construct a proof, it
is often worthwhile using specific numbers. Of course, the idea is to devise a proof
that doesn’t really use the fact that we are considering specific numbers and then
hope that the proof can be made to work in general.

To prove that 3° = 1 (mod 7), we start with the numbers 


1979.455.6; 


multiply each of them by 3, and reduce modulo 7. The results are listed in the 
following table: 


x (mod 7) 1 2 3 4 5 6 
3x (mod 7) 3 6 2 5 1 4 


Notice that each of the numbers 1, 2, 3, 4, 5, 6 reappears exactly once in the second 
row. So if we multiply together all the numbers in the second row, we get the same 
result as multiplying together all the numbers in the first row. Of course, we must 
work modulo 7. Thus, 


(3·1)(3·2)(3·3)(3·4)(3·5)(3·6) ≡ 1·2·3·4·5·6 (mod 7). 
    numbers in second row         numbers in first row 


To save space, we use the standard symbol n! for the number n factorial, which is 
the product of 1, 2, ..., n. In other words, 


n! = 1·2·3···(n−1)·n. 

Factoring out the six factors of 3 on the left-hand side of our congruence gives 

3^6 · 6! ≡ 6! (mod 7). 


Notice that 6! is relatively prime to 7, so we can cancel the 6! from both sides. This 
gives 3^6 ≡ 1 (mod 7), which is exactly Fermat’s Little Theorem for a = 3 and p = 7. 

We are now ready to prove Fermat’s Little Theorem in general. The key 
observation in our proof for 3^6 (mod 7) was that multiplication by 3 rearranged the 
numbers 1, 2, 3, 4, 5, 6 (mod 7). So first we are going to verify the following claim: 


Lemma 9.2. Let p be a prime number and let a be a number with a ≢ 0 (mod p). 
Then the numbers 


a, 2a, 3a, ..., (p−1)a (mod p) 

are the same as the numbers 

1, 2, 3, ..., (p−1) (mod p), 


although they may be in a different order. 


Proof. The list a, 2a, 3a, ..., (p−1)a contains p − 1 numbers, and clearly none of 
them are divisible by p. Suppose that we take two numbers ja and ka in this list, 
and suppose that they happen to be congruent, 


ja ≡ ka (mod p). 


Then p | (j − k)a, so p | (j − k), since we are assuming that p does not divide a. 
Notice that we are using the Prime Divisibility Property proved in Chapter 7, which 
says that if a prime divides a product then it divides one of the factors. On the other 
hand, we know that 1 ≤ j, k ≤ p − 1, so |j − k| < p − 1. There is only one number 
with absolute value less than p − 1 that is divisible by p and that number is zero. 
Hence, j = k. This shows that different multiples in the list a, 2a, 3a, ..., (p−1)a 
are distinct modulo p. 

So we now know that the list a, 2a, 3a, ..., (p−1)a contains p − 1 distinct 
nonzero values modulo p. But there are only p − 1 distinct nonzero values 
modulo p, that is, the numbers 1, 2, 3, ..., (p−1). Hence, the list a, 2a, 3a, ..., (p−1)a 
and the list 1, 2, 3, ..., (p−1) must contain the same numbers modulo p, although 
the numbers may appear in a different order. This finishes the proof of the lemma. 
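
The lemma is easy to test numerically. The following Python sketch (the function names are illustrative assumptions) lists the multiples a, 2a, ..., (p−1)a modulo p and checks that they are a rearrangement of 1, 2, ..., p−1.

```python
def multiples_mod_p(a, p):
    """Return the list a, 2a, 3a, ..., (p-1)a reduced modulo p."""
    return [(k * a) % p for k in range(1, p)]


def is_rearrangement(a, p):
    """Check whether the multiples of a modulo p are exactly 1, 2, ..., p-1 in some order."""
    return sorted(multiples_mod_p(a, p)) == list(range(1, p))


print(multiples_mod_p(3, 7))   # [3, 6, 2, 5, 1, 4], the second row of the table for 3x (mod 7)
print(all(is_rearrangement(a, 13) for a in range(1, 13)))   # True
```

Trying a composite modulus, for example `is_rearrangement(2, 6)`, shows that the conclusion can fail when p is not prime, which is why the proof needs the Prime Divisibility Property.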

Using the lemma, it is easy to finish the proof of Fermat’s Little Theorem. The 
lemma says that the lists of numbers 


a, 2a, 3a, ..., (p−1)a (mod p)   and   1, 2, 3, ..., (p−1) (mod p) 


are the same, so the product of the numbers in the first list is equal to the product 
of the numbers in the second list: 


a·(2a)·(3a)···((p−1)a) ≡ 1·2·3···(p−1) (mod p). 

Next we factor out p − 1 copies of a from the left-hand side to obtain 

a^(p−1) · (p−1)! ≡ (p−1)! (mod p). 


Finally, we observe that (p − 1)! is relatively prime to p, so we may cancel it from 
both sides to obtain Fermat’s Little Theorem, 


a^(p−1) ≡ 1 (mod p). □ 


Fermat’s Little Theorem can be used to show that a number is not a prime 
without actually factoring it. For example, it turns out that 


2^1234566 ≡ 899557 (mod 1234567). 


This means that 1234567 cannot be a prime, since if it were, Fermat’s Little 
Theorem would tell us that 2^1234566 must be congruent to 1 modulo 1234567. [If you’re 
wondering how we computed 2^1234566 (mod 1234567), don’t fret; we’ll describe 
how to do it in Chapter 16.] It turns out that 1234567 = 127·9721, so in this case 
we can actually find a factor. But consider the number 


m = 10^100 + 37. 

When we compute 2^(m−1) (mod m), we get 


2^(m−1) ≡ 36263603275458610624877601996335839108 
          36873253019151380128320824091124859463 
          579459059730070231844397 (mod m). 

Again we deduce from Fermat’s Little Theorem that 10^100 + 37 is not prime, but it 
is not at all clear how to find a factor. A quick check on a desktop computer reveals 
no prime factors less than 200,000. It is somewhat surprising that we can easily 
write down numbers that we know are composite, yet for which we are unable to 
find any factors. 
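
As a rough illustration of this compositeness test, here is a Python sketch. It relies on Python's built-in three-argument `pow`, which performs modular exponentiation; this is a stand-in for the method the text defers to Chapter 16, not a description of it. The name `fermat_witness` is an assumption made for this example.

```python
def fermat_witness(a, n):
    """Return True if a**(n-1) is not congruent to 1 modulo n.

    By Fermat's Little Theorem, True proves that n is not prime.
    False proves nothing: some composite n also give 1 for particular bases.
    """
    return pow(a, n - 1, n) != 1


print(fermat_witness(2, 1234567))      # True, so 1234567 is composite
print(pow(2, 1234566, 1234567))        # the text reports 899557 for this residue

m = 10**100 + 37
print(fermat_witness(2, m))            # the text reports that this residue is not 1
```

The test only ever certifies compositeness. For instance, 341 = 11·31 satisfies 2^340 ≡ 1 (mod 341), so `fermat_witness(2, 341)` returns False even though 341 is not prime.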


Exercises 


9.1. Use Fermat’s Little Theorem to perform the following tasks. 
(a) Find a number 0 ≤ a < 73 with a ≡ 9^794 (mod 73). 
(b) Solve x^86 ≡ 6 (mod 29). 
(c) Solve x^39 ≡ 3 (mod 13). 


9.2. The quantity (p — 1)! (mod p) appeared in our proof of Fermat’s Little Theorem, 
although we didn’t need to know its value. 
(a) Compute (p — 1)! (mod p) for some small values of p, find a pattern, and make a 
conjecture. 
(b) Prove that your conjecture is correct. [Try to discover why (p — 1)! (mod p) has the 
value it does for small values of p, and then generalize your observation to prove the 
formula for all values of p.] 


9.3. Exercise 9.2 asked you to determine the value of (p — 1)! (mod p) when p is a prime 
number. 
(a) Compute the value of (m — 1)! (mod m) for some small values of m that are not 
prime. Do you find the same pattern as you found for primes? 
(b) If you know the value of (n − 1)! (mod n), how can you use the value to definitely 
distinguish whether n is prime or composite? 


9.4. If p is a prime number and if a ≢ 0 (mod p), then Fermat’s Little Theorem tells us 
that a^(p−1) ≡ 1 (mod p). 
(a) The congruence 7^1734250 ≡ 1660565 (mod 1734251) is true. Can you conclude that 
1734251 is a composite number? 
(b) The congruence 129^64026 ≡ 15179 (mod 64027) is true. Can you conclude that 
64027 is a composite number? 
(c) The congruence 2^52632 ≡ 1 (mod 52633) is true. Can you conclude that 52633 is a 
prime number? 


Chapter 10 


Congruences, Powers, 
and Euler’s Formula 


In the previous chapter we proved Fermat’s Little Theorem: If p is a prime and 
p ∤ a, then a^(p−1) ≡ 1 (mod p). This formula is certainly not true if we replace p by 
a composite number. For example, 5^5 ≡ 5 (mod 6) and 2^8 ≡ 4 (mod 9). So we 
ask whether there is some power, depending on the modulus m, such that 


a^(???) ≡ 1 (mod m). 


Our first observation is that this is impossible if gcd(a,m) > 1. To see 
why, suppose that a^k ≡ 1 (mod m). Then a^k = 1 + my for some integer y, so 
gcd(a,m) divides a^k − my = 1. In other words, if some power of a is congruent 
to 1 modulo m, then we must have gcd(a,m) = 1. This suggests that we look at 
the set of numbers that are relatively prime to m, 


{a : 1 ≤ a ≤ m and gcd(a,m) = 1}. 


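For example, 

m     {a : 1 ≤ a ≤ m and gcd(a,m) = 1} 
1     {1} 
2     {1} 
3     {1, 2} 
4     {1, 3} 
5     {1, 2, 3, 4} 
6     {1, 5} 
7     {1, 2, 3, 4, 5, 6} 
8     {1, 3, 5, 7} 
9     {1, 2, 4, 5, 7, 8} 
10    {1, 3, 7, 9} 
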
The number of integers between 1 and m that are relatively prime to m is an 
important quantity, so we give this quantity a name: 


φ(m) = #{a : 1 ≤ a ≤ m and gcd(a,m) = 1}. 


The function φ is called Euler’s phi function. From the preceding table, we can 
read off the value of φ(m) for 1 ≤ m ≤ 10. Thus 


m       1   2   3   4   5   6   7   8   9   10 
φ(m)    1   1   2   2   4   2   6   4   6   4 


Notice that if p is a prime number then every integer 1 ≤ a < p is relatively 
prime to p. So for prime numbers we have the formula 


φ(p) = p − 1. 
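
The definition of φ(m) translates directly into a counting procedure. The sketch below is a literal, unoptimized rendering of the definition in Python; the names `units` and `phi_by_counting` are assumptions introduced only for this example.

```python
from math import gcd


def units(m):
    """Return the integers a with 1 <= a <= m and gcd(a, m) = 1."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]


def phi_by_counting(m):
    """Euler's phi function, computed straight from its definition."""
    return len(units(m))


for m in range(1, 11):
    print(m, units(m), phi_by_counting(m))
```

This reproduces the values of φ(m) for 1 ≤ m ≤ 10 listed above, and for a prime p it returns p − 1. Counting one number at a time is far too slow for large m, which is the problem Chapter 11 addresses.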


We are going to try to mimic our proof of Fermat’s Little Theorem. Suppose, 
for example, that we want to find a power of 7 that is congruent to 1 modulo 10. 
Rather than taking all the numbers 1 ≤ a < 10, we will just take the numbers that 
are relatively prime to 10. They are 


1, 3, 7, 9 (mod 10). 

If we multiply each of them by 7, we get 


7·1 ≡ 7 (mod 10),    7·3 ≡ 1 (mod 10), 
7·7 ≡ 9 (mod 10),    7·9 ≡ 3 (mod 10). 


Notice that we get back the same numbers, but rearranged. So if we multiply them 
together, we get the same product, 


(7·1)(7·3)(7·7)(7·9) ≡ 1·3·7·9 (mod 10) 
7^4 (1·3·7·9) ≡ 1·3·7·9 (mod 10). 


Now we can cancel 1·3·7·9 to get 7^4 ≡ 1 (mod 10). 

Where does the exponent 4 come from? It’s equal to the number of integers 
between 0 and 10 that are relatively prime to 10; that is, the exponent is 4 
because φ(10) = 4. This suggests the truth of the following formula. 


Theorem 10.1 (Euler’s Formula). If gcd(a,m) = 1, then 


a^φ(m) ≡ 1 (mod m). 

Proof. Now that we have identified the correct set of numbers to consider, the proof 
of Euler’s formula is almost identical to the proof of Fermat’s Little Theorem. So 
we let 

1 ≤ b_1 < b_2 < ··· < b_{φ(m)} < m 


be the φ(m) numbers between 0 and m that are relatively prime to m. 


Lemma 10.2. If gcd(a,m) = 1, then the numbers 


b_1 a, b_2 a, b_3 a, ..., b_{φ(m)} a (mod m) 

are the same as the numbers 


b_1, b_2, b_3, ..., b_{φ(m)} (mod m), 

although they may be in a different order. 


Proof of the lemma. We note that if b is relatively prime to m, then ab is also 
relatively prime to m. Hence, each of the numbers in the list 


b_1 a, b_2 a, b_3 a, ..., b_{φ(m)} a (mod m) 

is congruent to one number in the list 

b_1, b_2, b_3, ..., b_{φ(m)} (mod m). 


Furthermore, there are φ(m) numbers in each list. So if we can show that the 
numbers in the first list are distinct modulo m, it will follow that the two lists are 
the same (after rearranging). 
Suppose that we take two numbers b_j a and b_k a from the first list, and suppose 
that they are congruent, 

b_j a ≡ b_k a (mod m). 


Then m | (b_j − b_k)a. But m and a are relatively prime, so we find that m | (b_j − b_k). On 
the other hand, b_j and b_k are between 1 and m, which implies |b_j − b_k| ≤ m − 1. 
There is only one number with absolute value strictly less than m that is divisible 
by m and that number is zero. Hence, b_j = b_k. This shows that the numbers in the 
list 

b_1 a, b_2 a, b_3 a, ..., b_{φ(m)} a (mod m) 

are all distinct modulo m, which completes the proof that the lemma is true. 


Using the lemma, we can easily finish the proof of Euler’s formula. The lemma 
says that the lists of numbers 


b_1 a, b_2 a, b_3 a, ..., b_{φ(m)} a (mod m) 


and 

b_1, b_2, b_3, ..., b_{φ(m)} (mod m) 


are the same, so the product of the numbers in the first list is equal to the product 
of the numbers in the second list: 


(b_1 a)·(b_2 a)·(b_3 a)···(b_{φ(m)} a) ≡ b_1·b_2·b_3···b_{φ(m)} (mod m). 

We can factor out φ(m) copies of a from the left-hand side to obtain 

a^φ(m) B ≡ B (mod m),    where B = b_1 b_2 b_3 ··· b_{φ(m)}. 


Finally, we observe that B is relatively prime to m, since each of the b_i’s is 
relatively prime to m. This means we may cancel B from both sides to obtain Euler’s 
formula 

a^φ(m) ≡ 1 (mod m). □ 
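
Both Lemma 10.2 and Euler's formula can be checked by brute force for small moduli. The following Python sketch does so; the helper names are illustrative assumptions, and φ(m) is computed by direct counting rather than by any formula.

```python
from math import gcd


def units(m):
    """The numbers b_1 < b_2 < ... between 1 and m that are relatively prime to m."""
    return [b for b in range(1, m + 1) if gcd(b, m) == 1]


def lemma_holds(a, m):
    """Check that b_1*a, b_2*a, ... is a rearrangement of b_1, b_2, ... modulo m."""
    return sorted((b * a) % m for b in units(m)) == units(m)


def euler_holds(a, m):
    """Check that a**phi(m) ≡ 1 (mod m), with phi(m) found by counting."""
    return pow(a, len(units(m)), m) == 1


print(sorted((b * 7) % 10 for b in units(10)))   # [1, 3, 7, 9]
print(pow(7, 4, 10))                             # 1
print(all(lemma_holds(a, m) and euler_holds(a, m)
          for m in range(2, 60) for a in units(m)))   # True
```

The check starts at m = 2 because reducing modulo 1 sends every number to 0, which does not fit the convention that b_1 = 1.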


Exercises 


10.1. Let b_1 < b_2 < ··· < b_{φ(m)} be the integers between 1 and m that are relatively prime 
to m (including 1), and let B = b_1 b_2 b_3 ··· b_{φ(m)} be their product. The quantity B came up 
during the proof of Euler’s formula. 
(a) Show that either B ≡ 1 (mod m) or B ≡ −1 (mod m). 
(b) Compute B for some small values of m and try to find a pattern for when it is equal 
to +1 (mod m) and when it is equal to −1 (mod m). 


10.2. The number 3750 satisfies φ(3750) = 1000. [In the next chapter we’ll see how 
to compute φ(3750) with very little work.] Find a number a that has the following three 
properties: 


(i) a ≡ 7^3003 (mod 3750). 
(ii) 1 ≤ a ≤ 5000. 
(iii) a is not divisible by 7. 


10.3. A composite number m is called a Carmichael number if the congruence a^(m−1) ≡ 
1 (mod m) is true for every number a with gcd(a,m) = 1. 

(a) Verify that m = 561 = 3·11·17 is a Carmichael number. [Hint. It is not necessary 
to actually compute a^(m−1) (mod m) for all 320 values of a. Instead, use Fermat’s 
Little Theorem to check that a^(m−1) ≡ 1 (mod p) for each prime p dividing m, and 
then explain why this implies that a^(m−1) ≡ 1 (mod m).] 

(b) Try to find another Carmichael number. Do you think that there are infinitely many 
of them? 


Chapter 11 


Euler’s Phi Function and the 
Chinese Remainder Theorem 


Euler’s formula 
a^φ(m) ≡ 1 (mod m) 


is a beautiful and powerful result, but it won’t be of much use to us unless we 
can find an efficient way to compute the value of φ(m). Clearly, we don’t want 
to list all the numbers from 1 to m − 1 and check each to see if it is relatively 
prime to m. This would be very time consuming if m ≈ 1000, for example, and it 
would be impossible for m ≈ 10^100. As we observed in the last chapter, one case 
where φ(m) is easy to compute is when m = p is a prime, since then every integer 
1 ≤ a ≤ p − 1 is relatively prime to m. Thus, φ(p) = p − 1. 

We can easily derive a similar formula for φ(p^k) when m = p^k is a power of a 
prime. Rather than trying to count the numbers between 1 and p^k that are relatively 
prime to p^k, we will instead start with all numbers 1 ≤ a ≤ p^k, and then we will 
discard the ones that are not relatively prime to p^k. 

When is a number a not relatively prime to p^k? The only factors of p^k are 
powers of p, so a is not relatively prime to p^k exactly when it is divisible by p. In 
other words, 

φ(p^k) = p^k − #{a : 1 ≤ a ≤ p^k and p | a}. 

So we have to count how many integers between 1 and p^k are divisible by p. That’s 
easy, they are the multiples of p, 


p, 2p, 3p, 4p, ..., (p^(k−1) − 2)p, (p^(k−1) − 1)p, p^k. 

There are p^(k−1) of them, which gives us the formula 


φ(p^k) = p^k − p^(k−1). 

For example, 
φ(2401) = φ(7^4) = 7^4 − 7^3 = 2058. 


This means that there are 2058 integers between 1 and 2401 that are relatively 
prime to 2401. 
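
The discarding argument can be mirrored in a few lines of Python. This is only a sketch for small cases; the function names are assumptions made for the example.

```python
def phi_prime_power(p, k):
    """The prime power formula: phi(p^k) = p^k - p^(k-1), for p prime and k >= 1."""
    return p**k - p**(k - 1)


def phi_prime_power_by_discarding(p, k):
    """Start with all of 1..p^k and discard the multiples of p, as in the text."""
    n = p**k
    multiples_of_p = sum(1 for a in range(1, n + 1) if a % p == 0)
    return n - multiples_of_p


print(phi_prime_power(7, 4))                  # 2058
print(phi_prime_power_by_discarding(7, 4))    # 2058
```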

We now know how to compute φ(m) when m is a power of a prime. Next 
suppose that m is the product of two prime powers, m = p^j q^k. To formulate a 
conjecture, we compute φ(p^j q^k) for some small values and compare it with the 
values of φ(p^j) and φ(q^k). 


This table suggests that φ(p^j q^k) = φ(p^j)φ(q^k). We can also try some examples 
with numbers that are not prime powers, such as 


φ(14) = 6,    φ(15) = 8,    φ(210) = φ(14·15) = 48. 

All this leads us to guess that the following assertion is true: 

If gcd(m, n) = 1, then φ(mn) = φ(m)φ(n). 


Before trying to prove this multiplication formula, we show how it can be used to 
easily compute ¢(m) for any m or, more precisely, for any m that you are able to 
factor as a product of primes. 

Suppose that we are given a number m, and suppose that we have factored m 
as a product of primes, say 


m = p_1^(k_1) · p_2^(k_2) ··· p_r^(k_r), 

where p_1, p_2, ..., p_r are all different. First we use the multiplication formula to 
compute 


φ(m) = φ(p_1^(k_1)) · φ(p_2^(k_2)) ··· φ(p_r^(k_r)). 


Then we use the prime power formula φ(p^k) = p^k − p^(k−1) to obtain 


φ(m) = (p_1^(k_1) − p_1^(k_1−1)) · (p_2^(k_2) − p_2^(k_2−1)) ··· (p_r^(k_r) − p_r^(k_r−1)). 
This formula may look complicated, but the procedure to compute φ(m) is really 
very simple. For example, 


φ(1512) = φ(2^3·3^3·7) = φ(2^3)·φ(3^3)·φ(7) 
        = (2^3 − 2^2)·(3^3 − 3^2)·(7 − 1) = 4·18·6 = 432. 


So there are 432 numbers between 1 and 1512 that are relatively prime to 1512. 
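
The procedure just described might be sketched in Python as follows. Factoring is done here by simple trial division, which is an assumption of this example and is adequate only for small m; the names `factorize` and `phi` are likewise illustrative.

```python
def factorize(m):
    """Return {prime: exponent} for m >= 1, found by trial division."""
    factors = {}
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors[d] = factors.get(d, 0) + 1
            m //= d
        d += 1
    if m > 1:                      # whatever remains is a single prime factor
        factors[m] = factors.get(m, 0) + 1
    return factors


def phi(m):
    """Euler's phi function via the multiplication and prime power formulas."""
    result = 1
    for p, k in factorize(m).items():
        result *= p**k - p**(k - 1)
    return result


print(factorize(1512))   # {2: 3, 3: 3, 7: 1}
print(phi(1512))         # 432
print(phi(14), phi(15), phi(210))   # 6 8 48
```

Note that the hard part is the factorization, exactly as the text cautions: the formula computes φ(m) easily only for numbers m that one is able to factor.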

We are now ready to prove the multiplication formula for Euler’s phi function. 
We also restate the formula for prime powers so as to have both formulas conve- 
niently listed together. 


Theorem 11.1 (Phi Function Formulas). (a) If p is a prime and k ≥ 1, then 
φ(p^k) = p^k − p^(k−1). 
(b) If gcd(m,n) = 1, then φ(mn) = φ(m)φ(n). 


Proof. We verified the prime power formula (a) earlier in this chapter, so we need 
to check the product formula (b). We will do this by using one of the most powerful 
tools available in number theory: 


COUNTING 


You may wonder how counting can be so powerful. After all, it’s one of the first 
things taught in kindergarten.¹ Briefly, we are going to find one set that 
contains φ(mn) elements and a second set that contains φ(m)φ(n) elements. Then 
we will show that the two sets contain the same number of elements. 

The first set is 


{a : 1 ≤ a ≤ mn and gcd(a, mn) = 1}. 


It is clear that this set contains φ(mn) elements, since that’s just the definition 
of φ(mn). The second set is 


{(b, c) : 1 ≤ b ≤ m and gcd(b, m) = 1, 
          1 ≤ c ≤ n and gcd(c, n) = 1}. 


How many pairs (b, c) are in this second set? Well, there are φ(m) choices for b, 
since that’s the definition of φ(m), and there are φ(n) choices for c, since that’s the 
definition of φ(n). So there are φ(m) choices for the first coordinate b and φ(n) 
choices for the second coordinate c; so there are a total of φ(m)φ(n) choices for 
the pair (b, c). 


¹Yet another illustration of the principle that Everything I Ever Needed To Know I Learned in 
Kindergarten, although proving theorems in number theory probably isn’t one of the basic skills that 
Robert Fulghum had in mind when he wrote his book. 

For example, suppose that we take m = 4 and n = 5. Then the first set consists 
of the numbers 
{1, 3, 7, 9, 11, 13, 17, 19} 


that are relatively prime to 20. The second set consists of the pairs 


{(1,1), (1,2), (1,3), (1,4), (3,1), (3,2), (3,3), (3,4)} 


where the first number in each pair is relatively prime to 4 and the second number 
in each pair is relatively prime to 5. 

Going back to the general case, we are going to take each element in the first 
set and assign it to a pair in the second set in the following way: 


{a : 1 ≤ a ≤ mn, gcd(a, mn) = 1}  ⟶  {(b, c) : 1 ≤ b ≤ m, gcd(b, m) = 1, 
                                                1 ≤ c ≤ n, gcd(c, n) = 1} 
a mod mn  ⟼  (a mod m, a mod n) 


What this means is that we take the integer a in the first set and send it to the 
pair (b, c) with 


a ≡ b (mod m)   and   a ≡ c (mod n). 


This is probably clearer if we look again at our example with m = 4 and n = 5. 
Then, for example, the number 13 in the first set gets sent to the pair (1,3) in the 
second set, since 13 ≡ 1 (mod 4) and 13 ≡ 3 (mod 5). We do the same for each 
of the other numbers in the first set. 


{1, 3, 7, 9, 11, 13, 17, 19}  ⟶  {(1,1), (1,2), (1,3), (1,4), (3,1), (3,2), (3,3), (3,4)} 
1 ↦ (1,1)      11 ↦ (3,1) 
3 ↦ (3,3)      13 ↦ (1,3) 
7 ↦ (3,2)      17 ↦ (1,2) 
9 ↦ (1,4)      19 ↦ (3,4) 


In this example, you can see that each pair in the second set is matched with exactly 
one number in the first set. This means that the two sets have the same number of 
elements. We want to check that the same matching occurs in general. 
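
The matching in the example with m = 4 and n = 5 can be reproduced with a small Python sketch. The function name `matching` is an assumption for this illustration, and the sketch assumes m, n ≥ 2 so that no unit reduces to 0.

```python
from math import gcd


def matching(m, n):
    """Send each a in 1..mn with gcd(a, mn) = 1 to the pair (a mod m, a mod n)."""
    first_set = [a for a in range(1, m * n + 1) if gcd(a, m * n) == 1]
    return {a: (a % m, a % n) for a in first_set}


pairs = matching(4, 5)
for a, pair in pairs.items():
    print(a, "->", pair)

# Each pair is hit exactly once, so the two sets have the same size.
print(len(pairs), len(set(pairs.values())))   # 8 8
```

A finite check like this proves nothing in general, but it shows what statements (1) and (2) below assert: no pair is repeated and no pair is missed.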

We need to check that the following two statements are correct: 


1. Different numbers in the first set get sent to different pairs in the second set. 
2. Every pair in the second set is hit by some number in the first set. 


Once we verify these two statements, we will know that the two sets have the 
same number of elements. But we know that the first set has φ(mn) elements 
and the second set has φ(m)φ(n) elements. So in order to finish the proof that 
φ(mn) = φ(m)φ(n), we just need to verify (1) and (2). 

To check (1), we take two numbers a_1 and a_2 in the first set, and we suppose 
that they have the same image in the second set. This means that 


a_1 ≡ a_2 (mod m)   and   a_1 ≡ a_2 (mod n). 


Thus, a_1 − a_2 is divisible by both m and n. However, m and n are relatively prime, 
so a_1 − a_2 must be divisible by the product mn. In other words, 


a_1 ≡ a_2 (mod mn), 


which shows that a_1 and a_2 are the same element in the first set. This completes 
our proof of statement (1). 

To check statement (2), we need to show that for any given values of b and c 
we can find at least one integer a satisfying 


a ≡ b (mod m)   and   a ≡ c (mod n). 

The fact that these simultaneous congruences have a solution is of sufficient im- 
portance to warrant having its own name. 


Theorem 11.2 (Chinese Remainder Theorem). Let m and n be integers satisfying 
gcd(m,n) = 1, and let b and c be any integers. Then the simultaneous 
congruences 

x ≡ b (mod m)   and   x ≡ c (mod n) 


have exactly one solution with 0 ≤ x < mn. 


Proof. Let’s start, as usual, with an example. Suppose we want to solve 
x ≡ 8 (mod 11)   and   x ≡ 3 (mod 19). 


The solution to the first congruence consists of all numbers that have the form 
x = 11y + 8. We substitute this into the second congruence, simplify, and try to 
solve. Thus, 


11y + 8 ≡ 3 (mod 19) 
    11y ≡ 14 (mod 19). 

We know how to solve linear congruences of this sort (see the Linear Congruence 
Theorem in Chapter 8). The solution is y_1 ≡ 3 (mod 19), and then we can find 
the solution to the original congruences using x_1 = 11y_1 + 8 = 11·3 + 8 = 41. 
Finally, we should check our answer: (41 − 8)/11 = 3 and (41 − 3)/19 = 2. ✓ 

For the general case, we again begin by solving the first congruence x ≡ 
b (mod m). The solution consists of all numbers of the form x = my + b. We 
substitute this into the second congruence, which yields 


my ≡ c − b (mod n). 


We are given that gcd(m,n) = 1, so the Linear Congruence Theorem of Chapter 8 
tells us that there is exactly one solution y_1 with 0 ≤ y_1 < n. Then the solution to 
the original pair of congruences is given by 


x_1 = my_1 + b, 


and this will be the only solution x_1 with 0 ≤ x_1 < mn, since there is only 
one y_1 between 0 and n, and we multiplied y_1 by m to get x_1. This completes our 
proof of the Chinese Remainder Theorem and, with it, our proof of the formula 


φ(mn) = φ(m)φ(n). □ 
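
The proof is constructive, and its steps can be followed in a short Python sketch. One substitution should be stated plainly: instead of the Linear Congruence Theorem of Chapter 8, this sketch solves my ≡ c − b (mod n) with Python's built-in modular inverse `pow(m, -1, n)` (available in Python 3.8 and later). The name `crt_pair` is an assumption for this example, and b is first reduced modulo m so that the answer lands in the range 0 ≤ x < mn.

```python
def crt_pair(b, m, c, n):
    """Solve x ≡ b (mod m) and x ≡ c (mod n), assuming gcd(m, n) = 1.

    Follows the proof: write x = m*y + b, then solve m*y ≡ c - b (mod n) for y.
    """
    b %= m                                # use the representative 0 <= b < m
    y = ((c - b) * pow(m, -1, n)) % n     # the unique y with 0 <= y < n
    return m * y + b


print(crt_pair(8, 11, 3, 19))   # 41
```

For the worked example this gives y = 3 and x = 11·3 + 8 = 41, in agreement with the computation above.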


Historical Interlude. The first recorded instance of the Chinese Remainder 
Theorem appears in a Chinese mathematical work from the late third or early fourth 
century. Somewhat surprisingly, it deals with the harder problem of three 
simultaneous congruences. 


“We have a number of things, but we do not know exactly how many. 
If we count them by threes, we have two left over. If we count them 
by fives, we have three left over. If we count them by sevens, we have 
two left over. How many things are there?” 

Sun Tzu Suan Ching (Master Sun’s Mathematical Manual) 

Circa AD 300, volume 3, problem 26. 


Exercises 


11.1. (a) Find the value of φ(97). 
(b) Find the value of φ(8800). 


11.2. (a) If m ≥ 3, explain why φ(m) is always even. 
(b) φ(m) is “usually” divisible by 4. Describe all the m’s for which φ(m) is not divisible 
by 4. 


11.3. Suppose that p_1, p_2, ..., p_r are the distinct primes that divide m. Show that the 
following formula for φ(m) is correct. 


φ(m) = m (1 − 1/p_1)(1 − 1/p_2) ··· (1 − 1/p_r). 

Use this formula to compute φ(1000000). 


11.4. Write a program to compute φ(n), the value of Euler’s phi function. You should 
compute φ(n) by using a factorization of n into primes, not by finding all the a’s between 1 
and n that are relatively prime to n. 


11.5. For each part, find an x that solves the given simultaneous congruences. 
(a) x ≡ 3 (mod 7) and x ≡ 5 (mod 9) 
(b) x ≡ 3 (mod 37) and x ≡ 1 (mod 87) 
(c) x ≡ 5 (mod 7) and x ≡ 2 (mod 12) and x ≡ 8 (mod 13) 


11.6. Solve the 1700-year-old Chinese remainder problem from the Sun Tzu Suan Ching 
quoted in the Historical Interlude above. 


11.7. A farmer is on the way to market to sell eggs when a meteorite hits his truck and 
destroys all of his produce. In order to file an insurance claim, he needs to know how many 
eggs were broken. He knows that when he counted the eggs by 2’s, there was 1 left over, 
when he counted them by 3’s, there was 1 left over, when he counted them by 4’s, there 
was 1 left over, when he counted them by 5’s, there was 1 left over, and when he counted 
them by 6’s, there was 1 left over, but when he counted them by 7’s, there were none left 
over. What is the smallest number of eggs that were in the truck? 


11.8. Write a program that takes as input four integers (b, m, c, n) with gcd(m,n) = 
1 and computes an integer x with 0 ≤ x < mn satisfying 


x ≡ b (mod m)   and   x ≡ c (mod n). 


11.9. In this exercise you will prove a version of the Chinese Remainder Theorem for three 
congruences. Let m_1, m_2, m_3 be positive integers such that each pair is relatively prime. 
That is, 


gcd(m_1, m_2) = 1   and   gcd(m_1, m_3) = 1   and   gcd(m_2, m_3) = 1. 


Let a_1, a_2, a_3 be any three integers. Show that there is exactly one integer x in the interval 
0 ≤ x < m_1 m_2 m_3 that simultaneously solves the three congruences 


x ≡ a_1 (mod m_1),   x ≡ a_2 (mod m_2),   x ≡ a_3 (mod m_3). 

Can you figure out how to generalize this problem to deal with lots of congruences 

x ≡ a_1 (mod m_1),   x ≡ a_2 (mod m_2),   ...,   x ≡ a_r (mod m_r)? 


In particular, what conditions do the moduli m_1, m_2, ..., m_r need to satisfy? 

11.10. What can you say about n if the value of φ(n) is a prime number? What if it is the 
square of a prime number? 


11.11. (a) Find at least five different numbers n with φ(n) = 160. How many more can 
you find? 
(b) Suppose that the integer n satisfies φ(n) = 1000. Make a list of all of the primes that 
might possibly divide n. 
(c) Use the information from (b) to find all integers n that satisfy φ(n) = 1000. 


11.12. Find all values of n that solve each of the following equations. 
(a) φ(n) = n/2    (b) φ(n) = n/3    (c) φ(n) = n/6 


[Hint. The formula in Exercise 11.3 might be useful.] 


11.13. (a) For each integer 2 ≤ a ≤ 10, find the last four digits of a^1000. 
(b) Based on your experiments in (a) and further experiments if necessary, give a simple 
criterion that allows you to predict the last four digits of a^1000 from the value of a. 
(c) Prove that your criterion in (b) is correct. 


Chapter 12 


Prime Numbers 


Prime numbers are the basic building blocks of number theory. That’s what the 
Fundamental Theorem of Arithmetic, discussed in Chapter 7, tells us. Every num- 
ber is built up in a unique fashion by multiplying together prime numbers. There 
are analogous situations in other areas of science, and without exception the dis- 
covery and description of the building blocks has had a profound effect on its dis- 
cipline. For example, the field of chemistry was revolutionized by the discovery 
that every chemical is formed from a few basic elements and by Mendeleev cat- 
aloging these elements into families whose properties recur periodically. We will 
do something similar below when we split the set of prime numbers into various 
subsets, for example, into the set congruent to 1 modulo 4 and the set congruent 
to 3 modulo 4. Similarly, a tremendous advance in physics occurred when scien- 
tists discovered that the atoms comprising every element are made up of three basic 
particles, protons, neutrons, and electrons,¹ and that the number of each determines 
the chemical and physical attributes of the atom. For example, an atom made up 
of 92 protons and only 143 neutrons has properties that clearly distinguish it from 
its cousin with three additional neutrons. 

The fact that prime numbers are basic building blocks is sufficient reason to 
study their properties. Of course, this doesn’t imply that those properties will be 
interesting. Studying how to conjugate irregular verbs is important when learning 
a language, but that doesn’t make it very appealing. Luckily, the more one stud- 
ies prime numbers, the more interesting they become, and the more beautiful and 
surprising become the relationships that one discovers. In this brief chapter we 
will only have time to mention a few of the many remarkable properties of prime 
numbers. 


¹This description of an atom is a simplification, but it is a fairly accurate portrayal of the original 
atomic theories advanced in the early part of the twentieth century. 

To begin with, let’s list the first few primes: 

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, ... 


What can we glean from this list? First, it looks like 2 is the only even prime. This 
is true, of course. If n is even and larger than 2 then it factors as n = 2·(n/2). This 
makes 2 somewhat unusual among the set of primes, so people have been known 
to say that 


“2 is the oddest prime!”² 


A more important observation from our list of primes is signified by the ellipsis 
(three dots) appended at the end. This means that the list is not complete. For 
example, 67 and 71 are the next two primes. However, the real issue is whether 
the list ends or whether it continues indefinitely. In other words, are there infinitely 
many prime numbers? The answer is yes. We now give a beautiful proof that 
appeared in Euclid’s Elements more than 2000 years ago. 


Theorem 12.1 (Infinitely Many Primes Theorem). There are infinitely many prime 
numbers. 


Euclid’s Proof. Suppose that you have already compiled a (finite) list of primes. I 
am going to show you how to find a new prime that isn’t in your list. Since you can 
then add the new prime to the list and repeat the process, this will show that there 
must be infinitely many primes. 

So suppose we start with some list of primes p_1, p_2, ..., p_r. We multiply them 
together and add 1, which gives the number 


A = p_1 p_2 ··· p_r + 1. 


If A itself is prime, we’re done, since A is too large to be in the original list. But 
even if A is not prime, it will certainly be divisible by some prime, since every 
number can be written as a product of primes. Let q be some prime dividing A, for 
example, the smallest one. I claim that q is not in the original list, so it will be the 
desired new prime. 

Why isn’t q in the original list? We know that q divides A, so 


q divides p_1 p_2 ··· p_r + 1. 


If q were to equal one of the p_i’s, then it would have to divide 1, which is not 
possible. This means that q is a new prime that may be added to our list. Repeating 
this process, we can create a list of primes that is as long as we want. This shows 
that there must be infinitely many prime numbers. □ 


²Naturally, I would never even consider repeating such a weak joke! Notice that this is one of 
those jokes that is language specific. For example, it doesn’t work in French, since an odd number is 
impair, while an odd person or event is étrange or bizarre. 


Euclid’s proof is very clever and beautiful. We will illustrate the ideas in 
Euclid’s proof by using them to create a list of primes. We start with a list consisting 
of the single prime {2}. Following Euclid, we compute A = 2 + 1 = 3. This A 
is already prime, so we append it to our list. Now we have two primes, {2, 3}. 
Again using Euclid’s argument, we compute A = 2·3 + 1 = 7, and again A 
is prime and can be added to the list. This gives three primes, {2, 3, 7}. 
Repeating the argument gives A = 2·3·7 + 1 = 43, another prime! So now 
our list has four primes, {2, 3, 7, 43}. Into the breach once more, we compute 
A = 2·3·7·43 + 1 = 1807. This time, A is not prime; it factors as A = 13·139. 
We add 13 to our list, which now reads {2, 3, 7, 43, 13}. One more time, we 
compute A = 2·3·7·43·13 + 1 = 23479. This A also factors, A = 53·443. This 
gives the list {2, 3, 7, 43, 13, 53}, and we will stop here. But in principle we could 
continue this process to produce a list of primes of any specified length. 
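
The list-building process just illustrated can be sketched in Python. The sketch follows the choice made in the example, always adding the smallest prime factor of A; the function names and the use of trial division are assumptions of this illustration.

```python
def smallest_prime_factor(n):
    """Return the smallest prime dividing n (n >= 2), by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n   # no divisor up to sqrt(n), so n itself is prime


def euclid_primes(start, count):
    """Extend the list `start` to `count` primes using Euclid's construction."""
    primes = list(start)
    while len(primes) < count:
        A = 1
        for p in primes:
            A *= p
        A += 1                                    # A = p_1 p_2 ... p_r + 1
        primes.append(smallest_prime_factor(A))   # a prime not already in the list
    return primes


print(euclid_primes([2], 6))   # [2, 3, 7, 43, 13, 53]
```

The numbers A grow very quickly, so trial division soon becomes impractical. The construction shows that new primes always exist; it is not an efficient way to find them.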

We now know that the list of primes continues without end, and we also ob- 
served that 2 is the only even prime. Every odd number is congruent to either 1 
or 3 modulo 4, so we might ask which primes are congruent to 1 modulo 4 and 
which are congruent to 3 modulo 4. This separates the set of (odd) primes into 
two families, just as the periodic table separates the elements into families having 
similar properties. In the following list, we have boxed the primes congruent to 1 
modulo 4: 


3, [5], 7, 11, [13], [17], 19, 23, [29], 31, [37], [41], 43, 47, [53], 59, 
[61], 67, 71, [73], 79, 83, [89], [97], [101], ... 


There doesn’t seem to be any obvious pattern, although there do seem to be plenty 
of primes of each kind. Here’s a longer list. 


p ≡ 1 (mod 4)    5, 13, 17, 29, 37, 41, 53, 61, 73, 89, 97, 101, 109, 
                 113, 137, 149, 157, 173, 181, 193, 197, ... 

p ≡ 3 (mod 4)    3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83, 103, 
                 107, 127, 131, 139, 151, 163, 167, 179, ... 


Is it possible that one of the lines in this list eventually stops, or are there 
infinitely many primes in each family? It turns out that each line continues indef- 
initely. We will use a variation of Euclid’s proof to show that there are infinitely 
many primes congruent to 3 modulo 4. In Chapter 21 we use a slightly different 
argument to deal with the 1 modulo 4 primes. 
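
The two families are easy to tabulate by machine. The sketch below uses a sieve to list the primes up to a bound and then splits the odd ones by their remainder modulo 4; the sieve and the bound of 200 are assumptions of this example, not something the text prescribes.

```python
from math import isqrt


def primes_up_to(n):
    """Return all primes <= n (for n >= 2) using the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


odd_primes = [p for p in primes_up_to(200) if p != 2]
one_mod_4 = [p for p in odd_primes if p % 4 == 1]
three_mod_4 = [p for p in odd_primes if p % 4 == 3]

print("p ≡ 1 (mod 4):", one_mod_4)
print("p ≡ 3 (mod 4):", three_mod_4)
```

Up to 200 the two lists have 21 and 24 entries respectively, so neither family shows any sign of running out, although a finite table cannot settle the question.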
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Theorem 12.2 (Primes 3 (Mod 4) Theorem). There are infinitely many primes that 
are congruent to 3 modulo 4. 


Proof. We suppose that we have already compiled a (finite) list of primes, all of
which are congruent to 3 modulo 4. Our goal is to make the list longer by finding
a new 3 modulo 4 prime. Repeating this process gives a list of any desired length,
thereby proving that there are infinitely many primes congruent to 3 modulo 4.
Suppose that our initial list of primes congruent to 3 modulo 4 is


3, p_1, p_2, ..., p_r.


Consider the number
A = 4 p_1 p_2 ··· p_r + 3.


(Notice that we don’t include the prime 3 in the product.) We know that A can be
factored into a product of primes, say


A = q_1 q_2 ··· q_s.


I claim that among the primes q_1, q_2, ..., q_s at least one of them must be congruent
to 3 modulo 4. This is the key step in the proof. Why is it true? Well, if not,
then q_1, q_2, ..., q_s (which are all odd, since A is odd) would all be congruent
to 1 modulo 4, in which case their product A would be congruent to 1 modulo 4.
But you can see from its definition that A is clearly congruent to 3 modulo 4.
Hence, at least one of q_1, q_2, ..., q_s must be congruent to 3 modulo 4,
say q_i ≡ 3 (mod 4).

My second claim is that q_i is not in the original list. Why not? Well, we
know that q_i divides A, while it is clear from the definition of A that none of
3, p_1, p_2, ..., p_r divides A. (Each p_j divides 4 p_1 p_2 ··· p_r but does not
divide 3, while 3 divides 3 but does not divide 4 p_1 p_2 ··· p_r.) Thus, q_i is not
in our original list, so we may add it to the list and repeat the process. In this
way we can create as long a list as we want, which shows that there must be
infinitely many primes congruent to 3 modulo 4. □


We can use the ideas in the proof of the Primes 3 (Mod 4) Theorem to create
a list of primes congruent to 3 modulo 4. We need to start with a list containing
at least one such prime, and remember that 3 is not allowed in our list. So we
start with the list consisting of the single prime {7}. We compute A = 4·7 + 3 = 31.
This A is itself prime, so it is a new 3 (mod 4) prime to add to our
list. The list now reads {7, 31}, so we compute A = 4·7·31 + 3 = 871. This A
is not prime; it factors as A = 13·67. The proof of the theorem tells us that at
least one of the prime factors will be congruent to 3 modulo 4. In this case, the
prime 67 is 3 (mod 4), so we add it to our list. Next we take {7, 31, 67}, compute
A = 4·7·31·67 + 3 = 58159, and factor it as A = 19·3061. This time it is the
first factor 19 that is 3 (mod 4), so our list becomes {7, 31, 67, 19}. We will repeat
the process one more time. So


A = 4·7·31·67·19 + 3 = 1104967 = 179·6173,


which gives the prime 179 to add to the list, {7, 31, 67, 19, 179}.
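The same computation can be sketched in Python. This is only an illustration: the names `prime_factors` and `extend_3_mod_4` are ours, and we assume that whenever A has several prime factors congruent to 3 modulo 4 we keep the smallest one.

```python
def prime_factors(n):
    """Return the prime factors of n in increasing order (with repetition)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def extend_3_mod_4(start, steps):
    """Extend a list of 3 (mod 4) primes (not containing 3) using A = 4*p1*...*pr + 3."""
    primes = list(start)
    for _ in range(steps):
        a = 4
        for p in primes:
            a *= p
        a += 3
        # The theorem guarantees that some prime factor of A is 3 (mod 4).
        # Assumption: we take the smallest such factor.
        new_prime = next(q for q in prime_factors(a) if q % 4 == 3)
        primes.append(new_prime)
    return primes


print(extend_3_mod_4([7], 4))
```

Starting from {7}, four steps should give the list {7, 31, 67, 19, 179} found above.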

Why won’t the same idea work for 1 (mod 4) primes? This is not an idle 
question; it’s almost as important to understand the limitations of an argument as 
it is to understand why the argument is valid. So suppose we try to create a list 
of 1 (mod 4) primes. If we start with the list {p_1, p_2, ..., p_r}, we can compute
the number A = 4 p_1 p_2 ··· p_r + 1, factor it, and try to find a prime factor that is
a new 1 (mod 4) prime. What happens if we start with the list {5}? We compute
A = 4·5 + 1 = 21 = 3·7, and neither of the factors 3 or 7 is a 1 (mod 4) number.
So we’re stuck. The problem is that it is possible to multiply two 3 (mod 4)
numbers, such as 3 and 7, and end up with a 1 (mod 4) number like A = 21. In
general, we cannot use the fact that A ≡ 1 (mod 4) to deduce that some prime factor
of A is 1 (mod 4), and that’s why this proof won’t work for primes congruent
to 1 modulo 4.

There is no particular reason to consider only congruences modulo 4. For 
example, every number is congruent to either 0, 1, 2, 3, or 4 modulo 5; and except 
for 5 itself, every prime number is congruent to one of 1, 2, 3, or 4 modulo 5. 
(Why?) So we can break up the set of prime numbers into four families, depending 
on their congruence class modulo 5. Here’s a list of the first few numbers in each 
family: 


p ≡ 1 (mod 5):   11, 31, 41, 61, 71, 101, 131, 151, 181, 191, 211, 241

p ≡ 2 (mod 5):   2, 7, 17, 37, 47, 67, 97, 107, 127, 137, 157, 167, 197

p ≡ 3 (mod 5):   3, 13, 23, 43, 53, 73, 83, 103, 113, 163, 173, 193, 223

p ≡ 4 (mod 5):   19, 29, 59, 79, 89, 109, 139, 149, 179, 199, 229, 239


Again there seem to be lots of primes in each family, so we might guess that each 
contains infinitely many prime numbers. 
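Lists like these are easy to generate for any modulus. The following Python sketch sorts the primes up to a bound into families according to their remainder modulo m. The helper names `primes_up_to` and `primes_by_residue`, and the bound 245, are our own choices for illustration.

```python
def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, flag in enumerate(sieve) if flag]


def primes_by_residue(m, bound):
    """Group the primes <= bound by their remainder modulo m."""
    families = {}
    for p in primes_up_to(bound):
        families.setdefault(p % m, []).append(p)
    return families


families = primes_by_residue(5, 245)
for a in sorted(families):
    print(f"p = {a} (mod 5):", families[a])
```

The family with remainder 0 contains only the prime 5 itself, which is the exceptional case mentioned above.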

In general, if we fix a modulus m and a number a, when might we expect 
there to be infinitely many primes congruent to a modulo m? There is one sit- 
uation in which this cannot happen, that is if a and m have a common factor. 
For example, suppose that p is a prime and that p ≡ 35 (mod 77). This means
that p = 35 + 77y = 7(5 + 11y), so the only possibility is p = 7, and even p = 7
doesn’t work. Generally, if p is a prime satisfying p ≡ a (mod m), then gcd(a, m)
divides p. So either gcd(a, m) = 1 or else gcd(a, m) = p, which means there is at
most one possibility for p. Thus, it is really only interesting to ask about primes
congruent to a modulo m if we assume that gcd(a, m) = 1. A famous theorem of
Dirichlet from 1837 says that with this assumption there are always infinitely many
primes congruent to a modulo m.


Theorem 12.3 (Dirichlet’s Theorem on Primes in Arithmetic Progressions*). Let a 
and m be integers with gcd(a,m) = 1. Then there are infinitely many primes that 
are congruent to a modulo m. That is, there are infinitely many prime numbers p 
satisfying 

p ≡ a (mod m).


Earlier in this chapter we proved Dirichlet’s Theorem for (a,m) = (3,4), and 
Exercise 12.2 asks you to do (a,m) = (5,6). In Chapter 21, we will deal with 
(a,m) = (1,4). Unfortunately, the proof of Dirichlet’s Theorem for all (a,m) is
quite complicated, so we will not be able to give it in this book. The proof uses 
advanced methods from calculus and, in fact, calculus with complex numbers! 


Exercises 


12.1. Start with the list consisting of the single prime {5} and use the ideas in Euclid’s 
proof that there are infinitely many primes to create a list of primes until the numbers get 
too large for you to easily factor. (You should be able to factor any number less than 1000.) 


12.2. (a) Show that there are infinitely many primes that are congruent to 5 modulo 6.
[Hint. Use A = 6 p_1 p_2 ··· p_r + 5.]
(b) Try to use the same idea (with A = 5 p_1 p_2 ··· p_r + 4) to show that there are infinitely
many primes congruent to 4 modulo 5. What goes wrong? In particular, what happens
if you start with {19} and try to make a longer list?


12.3. Let p be an odd prime number. Write the quantity


1 + 1/2 + 1/3 + ··· + 1/(p − 1)


as a fraction A_p/B_p in lowest terms.
(a) Find the value of A_p (mod p) and prove that your answer is correct.
(b) Make a conjecture for the value of A_p (mod p^2).
(c) Prove your conjecture in (b). (This is quite difficult.)


*An arithmetic progression is a list of numbers with a common difference. For example, 2, 7,
12, 17, 22, ...is an arithmetic progression with common difference 5. The numbers congruent to a 
modulo m form an arithmetic progression with common difference m, which explains the name of 
Dirichlet’s Theorem. 
12.4. Let m be a positive integer, let a_1, a_2, ..., a_φ(m) be the integers between 1 and m
that are relatively prime to m, and write the quantity


1/a_1 + 1/a_2 + 1/a_3 + ··· + 1/a_φ(m)


as a fraction A_m/B_m in lowest terms.
(a) Find the value of A_m (mod m) and prove that your answer is correct.
(b) Generate some data for the value of A_m (mod m^2), try to find patterns, and then
try to prove that the patterns you observe are true in general. In particular, when is
A_m ≡ 0 (mod m^2)?


12.5. Recall that the number n factorial, which is written n!, is equal to the product 
n! = 1·2·3 ··· (n − 1)·n.


(a) Find the highest power of 2 dividing each of the numbers 1!, 2!, 3!,..., 10!. 

(b) Formulate a rule that gives the highest power of 2 dividing n!. Use your rule to 
compute the highest power of 2 dividing 100! and 1000!. 

(c) Prove that your rule in (b) is correct. 

(d) Repeat (a), (b), and (c), but this time for the largest power of 3 dividing n!. 

(e) Try to formulate a general rule for the highest power of a prime p that divides n!. Use 
your rule to find the highest power of 7 dividing 1000! and the highest power of 11 
dividing 5000!. 

(f) Using your rule from (e) or some other method, prove that if p is prime and if p^m
divides n! then m < n/(p − 1). (This inequality is very important in many areas of
advanced number theory.)


12.6. (a) Find a prime p satisfying p ≡ 1338 (mod 1115). Are there infinitely many such
primes?

(b) Find a prime p satisfying p ≡ 1438 (mod 1115). Are there infinitely many such
primes?


Chapter 13 


Counting Primes 


How many prime numbers are there? We have already given the answer that there 
are infinitely many. Of course, there are also infinitely many composite numbers. 
Which are there more of, primes or composites? Despite the fact that there are 
infinitely many of each, we can compare them by using a counting function. 

First, let’s start with an easier question that will illustrate the underlying idea. 
Our intuition says that approximately half of all numbers are even. We can put this 
intuition onto firmer ground by looking at the even number counting function: 


E(x) = #{even numbers n with 1 ≤ n ≤ x}.


This function counts how many even numbers there are less than or equal to x. For 
example, 


E(3) = 1,   E(4) = 2,   E(5) = 2,
E(100) = 50,   E(101) = 50,   ....


To study what fraction of all numbers are even, we should look at the ratio E(x)/x.
Thus,


E(3)/3 = 1/3,   E(4)/4 = 1/2,   E(5)/5 = 2/5,
E(100)/100 = 1/2,   E(101)/101 = 50/101,   ....


It is certainly not true that the ratio E(x)/x is always equal to 1/2, but it is true that
when x is large E(x)/x will be close to 1/2. If you have taken a little bit of calculus,
you will recognize that we are trying to say that


lim_{x→∞} E(x)/x = 1/2.


This statement¹ just means that as x gets larger and larger the distance between
E(x)/x and 1/2 gets closer and closer to 0.
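To watch the ratio settle down, here is a small Python sketch of the even number counting function. The function name `E` mirrors the notation of the text; counting by brute force is our choice, made for clarity rather than speed.

```python
def E(x):
    """Count the even numbers n with 1 <= n <= x."""
    return sum(1 for n in range(1, x + 1) if n % 2 == 0)


for x in (3, 4, 5, 100, 101, 10**6):
    print(x, E(x), E(x) / x)
```

For odd x the ratio falls slightly below 1/2, but the shortfall shrinks as x grows.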

Now let’s do the same thing for prime numbers. The counting function for
prime numbers is called π(x), where “π” is an abbreviation for “prime.” (This use
of the Greek letter π has nothing to do with the number 3.14159... .) Thus

π(x) = #{primes p with p ≤ x}.


For example, π(10) = 4, since the primes less than 10 are 2, 3, 5, and 7. Similarly,
the primes less than 60 are


2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59,


so π(60) = 17. Here’s a short table giving the values of π(x) and the ratio π(x)/x.


x         10      25      50      100     200     500     1000    5000
π(x)      4       9       15      25      46      95      168     669
π(x)/x    0.400   0.360   0.300   0.250   0.230   0.190   0.168   0.134


It certainly looks like the ratio π(x)/x is getting smaller and smaller as x gets
larger. Assuming that this pattern continues, we would be justified in saying that
“most numbers are not prime.” This raises the further question of just how rapidly
π(x)/x decreases. The answer is provided by the following celebrated result,
which is one of the pinnacles of nineteenth-century number theory.


Theorem 13.1 (The Prime Number Theorem). When x is large, the number of
primes less than x is approximately equal to x/ln(x). In other words,


lim_{x→∞} π(x) / (x/ln(x)) = 1.


The quantity ln(x), which is called the natural logarithm of x, is the logarithm
of x to the base e = 2.7182818... .² Here is a table that compares the values
of π(x) and x/ln(x).


x                 10      100     1000     10^4      10^6       10^9
π(x)              4       25      168      1229      78498      50847534
x/ln(x)           4.34    21.71   144.76   1085.74   72382.41   48254942.43
π(x)/(x/ln(x))    0.921   1.151   1.161    1.132     1.084      1.054


¹This mathematical statement is read “the limit, as x goes to infinity, of E(x)/x is equal to 1/2.”

²If you are not familiar with natural logarithms, you can just think of ln(x) as being approxi-
mately equal to 2.30259 log(x), where log(x) is the usual logarithm to the base 10. The natural
logarithm is so important in mathematics and science that most scientific calculators have a special
button to compute it. The natural logarithm appears “naturally” in problems involving compound
growth, such as population growth, interest payments, and decay of radioactive materials. It is a
wonderful fact that this widely applicable function also appears in the purely mathematical problem
of counting prime numbers.
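A comparison of this kind can be recomputed with a short Python sketch. It counts primes with a sieve of Eratosthenes, which is our choice of method and not something the text prescribes, and the name `prime_count_table` is hypothetical. The bound 10^9 is left out because a simple sieve of that size is too slow in plain Python.

```python
import math


def prime_count_table(bounds):
    """Return rows (x, pi(x), x/ln(x), pi(x)/(x/ln(x))) for each x in bounds."""
    limit = max(bounds)
    sieve = bytearray([1]) * (limit + 1)  # sieve[n] == 1 means "n may be prime"
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    rows = []
    for x in bounds:
        pi_x = sum(sieve[: x + 1])  # number of primes <= x
        approx = x / math.log(x)
        rows.append((x, pi_x, approx, pi_x / approx))
    return rows


for x, pi_x, approx, ratio in prime_count_table([10, 100, 1000, 10**4, 10**6]):
    print(f"{x:>8} {pi_x:>7} {approx:>10.2f} {ratio:.3f}")
```

The last column should drift slowly toward 1, as the Prime Number Theorem predicts.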


By examining similar, but shorter, tables around 1800, Carl Friedrich Gauss and 
Adrien-Marie Legendre independently were led to conjecture that the Prime Num- 
ber Theorem should be true. Almost a century passed before a proof was found. 
In 1896 Jacques Hadamard and Ch. de la Vallée Poussin each managed to prove the 
Prime Number Theorem. Just as with Dirichlet’s Theorem, the proof uses meth- 
ods from complex analysis (i.e., calculus with complex numbers). More recently, 
in 1948, Paul Erdős and Atle Selberg found an “elementary” proof of the Prime
Number Theorem. Their proof is elementary in the sense that it does not require 
methods from complex analysis, but it is by no means easy, so we are not able to 
present it here. 

It is somewhat surprising that to prove theorems about whole numbers, such 
as Dirichlet’s Theorem and the Prime Number Theorem, mathematicians have to 
use tools from calculus. An entire branch of mathematics called Analytic Number 
Theory is devoted to proving theorems in number theory using calculus methods. 

There are many famous unsolved problems involving prime numbers. We con- 
clude this chapter by describing three such problems with a little bit of their history. 


Conjecture 13.2 (Goldbach’s Conjecture). Every even number n ≥ 4 is a sum of
two primes.


Goldbach proposed this conjecture to Euler in a letter dated June 7, 1742. It is 
not hard to check that Goldbach’s Conjecture is true for the first few even numbers. 
Thus, 


4 = 2 + 2,   6 = 3 + 3,   8 = 3 + 5,   10 = 3 + 7,   12 = 5 + 7,
14 = 3 + 11,   16 = 3 + 13,   18 = 5 + 13,   20 = 7 + 13, ....


This verifies Goldbach’s Conjecture for all even numbers up to 20. Using comput- 
ers, Goldbach’s conjecture has been checked for all even numbers up to 2·10^10.
Even better, mathematicians have been able to prove results that are similar to 
Goldbach’s Conjecture. These suggest that Goldbach’s Conjecture is also true. 
One such theorem was proved by I.M. Vinogradov in 1937. He showed that every 
(sufficiently large) odd number n is a sum of three primes. A second theorem, 
proved by Chen Jing-run in 1966, says that every (sufficiently large) even number
is a sum of two numbers p + a, where p is a prime number and a is either prime or 
a product of two primes. 
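Checking Goldbach’s Conjecture for small even numbers is a pleasant exercise for a computer. The Python sketch below uses trial division for primality, and the names `is_prime` and `goldbach_pairs` are ours. It is meant for small numbers only and says nothing about how the large-scale verifications were actually carried out.

```python
def is_prime(n):
    """Return True if n is prime, using trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def goldbach_pairs(n):
    """Return all pairs (p, q) of primes with p <= q and p + q = n."""
    return [(p, n - p) for p in range(2, n // 2 + 1)
            if is_prime(p) and is_prime(n - p)]


for n in range(4, 21, 2):
    print(n, goldbach_pairs(n))

# Every even number from 4 to 2000 has at least one such pair.
print(all(goldbach_pairs(n) for n in range(4, 2001, 2)))
```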


Conjecture 13.3 (The Twin Primes Conjecture). There are infinitely many prime 
numbers p such that p + 2 is also prime. 


The list of prime numbers is quite irregular, and there are often very large gaps 
between consecutive primes. For example, there are 111 composite numbers fol- 
lowing the prime 370,261. On the other hand, there seem to be quite a few instances 
in which a prime p is followed almost immediately by another prime p+ 2. (Of 
course, p + 1 cannot be prime, since it is even.) These pairs are called twin primes, 
and the Twin Primes Conjecture says that the list of twin primes should never end. 
The first few twin primes are 


(3, 5), (5, 7), (11, 13), (17, 19), (29, 31), (41, 43), (59, 61), (71, 73),
(101, 103), (107, 109), (137, 139), (149, 151), (179, 181), (191, 193),
(197, 199), (227, 229), (239, 241), (269, 271), (281, 283), (311, 313).


Just as with Goldbach’s Conjecture, people have used computers to compile long
lists of twin primes, including, for example, the tremendous pair consisting of


242206083 · 2^38880 − 1   and   242206083 · 2^38880 + 1.


As further evidence for the validity of the conjecture, Chen Jing-run proved in 1966 
that there are infinitely many primes p such that p + 2 is either a prime or a product 
of two primes. 
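Both observations, the list of twin primes and the long run of composite numbers after 370,261, can be checked with a few lines of Python. This is a sketch using trial division; the names `twin_primes` and `composites_after` are hypothetical helpers of our own.

```python
def is_prime(n):
    """Return True if n is prime, using trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def twin_primes(bound):
    """Return all pairs (p, p + 2) of primes with p <= bound."""
    return [(p, p + 2) for p in range(2, bound + 1)
            if is_prime(p) and is_prime(p + 2)]


def composites_after(p):
    """Count the consecutive composite numbers immediately following p."""
    count = 0
    n = p + 1
    while not is_prime(n):
        count += 1
        n += 1
    return count


print(twin_primes(311))
print(composites_after(370261))
```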


Conjecture 13.4 (The N^2 + 1 Conjecture). There are infinitely many primes of the
form N^2 + 1.


If N is odd, then N^2 + 1 is even, so it cannot be prime (unless N = 1).
However, if N is even, then N^2 + 1 seems frequently to be prime. The N^2 + 1
Conjecture says that this should happen infinitely often. The first few primes of
this form are


2^2 + 1 = 5,   4^2 + 1 = 17,   6^2 + 1 = 37,   10^2 + 1 = 101,
14^2 + 1 = 197,   16^2 + 1 = 257,   20^2 + 1 = 401,   24^2 + 1 = 577,
26^2 + 1 = 677,   36^2 + 1 = 1297,   40^2 + 1 = 1601.


The best result currently known was proved by Henryk Iwaniec in 1978. He
showed that there are infinitely many values of N for which N^2 + 1 is either prime
or a product of two primes.
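The list of primes of the form N^2 + 1 can be extended as far as patience allows. A minimal Python sketch, again with trial division and with the hypothetical name `n_squared_plus_one_primes`:

```python
def is_prime(n):
    """Return True if n is prime, using trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def n_squared_plus_one_primes(bound):
    """Return the values of N with 1 <= N <= bound for which N^2 + 1 is prime."""
    return [N for N in range(1, bound + 1) if is_prime(N * N + 1)]


for N in n_squared_plus_one_primes(40):
    print(f"{N}^2 + 1 = {N * N + 1}")
```

Apart from N = 1, every value printed is even, in agreement with the remark above about odd N.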
Although no one knows if there are infinitely many twin primes or infinitely
many primes of the form N^2 + 1, mathematicians have guessed what their counting
functions should look like. Let


T(x) = #{primes p ≤ x such that p + 2 is also prime},
S(x) = #{primes p ≤ x such that p has the form N^2 + 1}.


Then it is conjectured that


lim_{x→∞} T(x) / (x/(ln x)^2) = C   and   lim_{x→∞} S(x) / (√x/ln x) = C′.

The numbers C and C′ are a bit complicated to describe precisely. For example, C
is approximately equal to 0.66016.


Exercises 


13.1. (a) Explain why the statement “one-fifth of all numbers are congruent to 2 modulo 5”
makes sense by using the counting function


F(x) = #{positive numbers n ≤ x satisfying n ≡ 2 (mod 5)}.


(b) Explain why the statement “most numbers are not squares” makes sense by using the
counting function


S(x) = #{square numbers less than x}.
Find a simple function of x that is approximately equal to S(x) when x is large.


13.2. (a) Check that every even number between 70 and 100 is a sum of two primes. 
(b) How many different ways can 70 be written as a sum of two primes 70 = p + q with
p ≤ q? Same question for 90? Same question for 98?


13.3. The number n! (n factorial) is the product of all numbers from 1 to n. For example,
4! = 1·2·3·4 = 24 and 7! = 1·2·3·4·5·6·7 = 5040. If n ≥ 2, show that all the
numbers

n! + 2,  n! + 3,  n! + 4,  ...,  n! + (n − 1),  n! + n


are composite numbers. 


13.4. (a) Do you think there are infinitely many primes of the form N^2 + 2?
(b) Do you think there are infinitely many primes of the form N^2 − 2?
(c) Do you think there are infinitely many primes of the form N^2 + 3N + 2?
(d) Do you think there are infinitely many primes of the form N^2 + 2N + 2?
13.5. The Prime Number Theorem says that the number of primes smaller than x is approx-
imately x/ln(x). This exercise asks you to explain why certain statements are plausible.
So do not try to write down formal mathematical proofs. Instead, explain as convincingly
as you can in words why the Prime Number Theorem makes each of the following state-
ments reasonable.
(a) If you choose a random integer between 1 and x, then the probability that you chose
a prime number is approximately 1/ln(x).
(b) If you choose two random integers between 1 and x, then the probability that both of
them are prime numbers is approximately 1/(ln x)^2.
(c) The number of twin primes between 1 and x should be approximately x/(ln x)^2.
[Notice that this explains the conjectured limit formula for the twin prime counting
function T(x).]


13.6. (This exercise is for people who have taken some calculus.) The Prime Number The-
orem says that the counting function for primes, π(x), is approximately equal to x/ln(x)
when x is large. It turns out that π(x) is even closer to the value of the definite integral


∫_2^x dt/ln(t).


(a) Show that


lim_{x→∞} ( ∫_2^x dt/ln(t) ) / ( x/ln(x) ) = 1.


This means that ∫_2^x dt/ln(t) and x/ln(x) are approximately the same when x is
large. [Hint. Use L’Hôpital’s rule and the Second Fundamental Theorem of Calcu-
lus.]

(b) It can be shown that


∫ dt/ln(t) = ln(ln(t)) + ln(t) + (ln(t))^2/(2·2!) + (ln(t))^3/(3·3!) + (ln(t))^4/(4·4!) + ···.


Use this series to compute numerically the value of ∫_2^x dt/ln(t) for x = 10, 100,
1000, 10^4, 10^6, and 10^9. Compare the values you get with the values of π(x)
and x/ln(x) given in the table earlier in this chapter. Which is closer to π(x), the integral
∫_2^x dt/ln(t) or the function x/ln(x)? (This problem can be done with a simple cal-
culator, but you’ll probably prefer to use a computer or programmable calculator.)

(c) Differentiate the series in (b) and show that the derivative is actually equal to 1/ln(t).
[Hint. Use the series for e^x.]


Chapter 14 


Mersenne Primes 


In this chapter we will study primes that can be written in the form a^n − 1 with
n ≥ 2. For example, 31 is such a prime, since 31 = 2^5 − 1. The first step is to look
at some data.


[Table: factorizations of a^n − 1 for small values of a and n, including
2^4 − 1 = 3·5 and 2^5 − 1 = 31.]


An easy observation is that if a is odd then a^n − 1 is even, so it cannot be prime.
Looking at the table, we also see that it appears that a^n − 1 is always divisible by
a − 1. This observation is indeed true. We can prove that it is true by using the
famous formula for the sum of a geometric series:


x^n − 1 = (x − 1)(x^(n−1) + x^(n−2) + ··· + x^2 + x + 1).        Geometric Series


To check this Geometric Series formula, we multiply out the product on the 
right. Thus, 


(x − 1)(x^(n−1) + x^(n−2) + ··· + x^2 + x + 1)
    = x·(x^(n−1) + x^(n−2) + ··· + x^2 + x + 1)
      − 1·(x^(n−1) + x^(n−2) + ··· + x^2 + x + 1)
    = (x^n + x^(n−1) + ··· + x^3 + x^2 + x)
      − (x^(n−1) + x^(n−2) + ··· + x^2 + x + 1)
    = x^n − 1,


since all the other terms cancel.

Using the Geometric Series formula with x = a, we see immediately that
a^n − 1 is always divisible by a − 1. So a^n − 1 will be composite unless a − 1 = 1,
that is, unless a = 2.

However, even if a = 2, the number 2^n − 1 is frequently composite. Again we
look at some data:


n          2   3   4     5    6       7     8        9      10
2^n − 1    3   7   3·5   31   3^2·7   127   3·5·17   7·73   3·11·31


Even this short table suggests the following: 


When n is even, 2^n − 1 is divisible by 3 = 2^2 − 1.
When n is divisible by 3, 2^n − 1 is divisible by 7 = 2^3 − 1.
When n is divisible by 5, 2^n − 1 is divisible by 31 = 2^5 − 1.


So we suspect that if n is divisible by m, then 2^n − 1 will be divisible by 2^m − 1.

Having made this observation, it is easy to verify that it is true. So suppose
that n factors as n = mk. Then 2^n = 2^(mk) = (2^m)^k. We use the Geometric Series
formula with x = 2^m to obtain


2^n − 1 = (2^m)^k − 1 = (2^m − 1)((2^m)^(k−1) + (2^m)^(k−2) + ··· + (2^m)^2 + (2^m) + 1).


This shows that if n is composite then 2^n − 1 is composite. We have verified the
following fact.


Proposition 14.1. If a^n − 1 is prime for some numbers a ≥ 2 and n ≥ 2, then a
must equal 2 and n must be a prime.
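The three facts behind the proposition can be spot-checked numerically. The Python sketch below is only a sanity check on small cases, not a proof. The ranges of a and n and the helper names `is_prime` and `geometric_sum` are our own choices.

```python
def is_prime(n):
    """Return True if n is prime, using trial division."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def geometric_sum(x, n):
    """Return x^(n-1) + x^(n-2) + ... + x^2 + x + 1."""
    return sum(x ** k for k in range(n))


# Geometric Series formula: x^n - 1 = (x - 1)(x^(n-1) + ... + x + 1).
for a in range(2, 21):
    for n in range(2, 13):
        assert a ** n - 1 == (a - 1) * geometric_sum(a, n)

# If m divides n, then 2^m - 1 divides 2^n - 1.
for n in range(2, 61):
    for m in range(1, n + 1):
        if n % m == 0:
            assert (2 ** n - 1) % (2 ** m - 1) == 0

# Which pairs (a, n) in our small range make a^n - 1 prime?
prime_cases = [(a, n) for a in range(2, 21) for n in range(2, 13)
               if is_prime(a ** n - 1)]
print(prime_cases)
```

Every pair that is printed should have a = 2 and n prime, as the proposition requires. The converse fails: n = 11 is prime, but the pair (2, 11) does not appear.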


This means that if we are interested in primes of the form a^n − 1 we only need
to consider the case that a = 2 and n is prime. Primes of the form


2^p − 1
are called Mersenne primes. The first few Mersenne primes are


2^2 − 1 = 3,   2^3 − 1 = 7,   2^5 − 1 = 31,   2^7 − 1 = 127,   2^13 − 1 = 8191.
Of course, not every number 2^p − 1 is prime. For example,
2^11 − 1 = 2047 = 23·89   and   2^29 − 1 = 536870911 = 233·1103·2089.


The Mersenne primes are named after Father Marin Mersenne (1588-1648),
who asserted in 1644 that 2^p − 1 is prime for


p = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, 257


and that these are the only primes less than 258 for which 2^p − 1 is prime. It is not
known how Mersenne discovered these “facts,” especially since it turns out that his
list is not correct. The complete list of primes p less than 10000 for which 2^p − 1
is prime is¹


p = 2,3,5,7, 13, 17,19, 31,61, 89, 107, 127,521, 607, 1279, 
2203, 2281, 3217, 4253, 4423, 9689, 9941. 
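The text does not say how such a list is verified. One standard method, which is not described in this chapter and is brought in here purely as an illustration, is the Lucas-Lehmer test: for an odd prime p, set s = 4 and replace s by s^2 − 2 (mod 2^p − 1) a total of p − 2 times; then 2^p − 1 is prime exactly when the final value of s is 0. A minimal Python sketch, with function names of our own choosing:

```python
def is_prime(n):
    """Return True if n is prime, using trial division (fine for small exponents)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def lucas_lehmer(p):
    """Return True if 2^p - 1 is prime; p is assumed to be a prime exponent."""
    if p == 2:
        return True  # 2^2 - 1 = 3 is prime; the test below needs p odd
    m = (1 << p) - 1  # the Mersenne number 2^p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


exponents = [p for p in range(2, 1300) if is_prime(p) and lucas_lehmer(p)]
print(exponents)
```

We stop at 1300 only to keep the running time short. Raising the bound to 10000 should recover the rest of the list, at the cost of a longer wait.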


It is a nontrivial problem to check a large number for primality, and indeed it
wasn’t until 1876 that E. Lucas proved conclusively that 2^127 − 1 is prime. Lucas’s
39-digit number remained the largest known prime until the 1950s, when the
advent of electronic computing machines made it possible to check numbers with
hundreds of digits for primality. Table 14.1 lists Mersenne primes that have been
discovered in recent years using computers, together with the names of the peo-
ple who made the discoveries. The largest known prime has more than 12 million
digits!

The most recent Mersenne primes in Table 14.1 were unearthed using special-
ized software as part of Woltman’s Great Internet Mersenne Prime Search. You,
too, can take part in the search for world record primes² by downloading software
from the GIMPS website


www.mersenne.org/prime.htm
Further historical and topical information about Mersenne primes is available at
www.utm.edu/research/primes/mersenne.shtml


Of course, although it is interesting to see a list like this of the world’s largest 
known primes, there is no huge mathematical significance in finding a few more 
Mersenne primes. Far more interesting from a mathematical perspective is the 
following question. The answer is not known. 


¹Notice that Father Mersenne made five mistakes, three of omission (61, 89, 107) and two of
commission (67, 257).

²Andy Warhol opined that in the future everyone will be famous for 15 minutes. One route to
such fame is to find the largest known (Mersenne) prime. And the quest for bigger and better primes
continues.
p                           Discovered by         Date
521, 607, 1279, 2203, 2281  Robinson              1952
3217                        Riesel                1957
4253, 4423                  Hurwitz               1961
9689, 9941, 11213           Gillies               1963
19937                       Tuckerman             1971
21701                       Noll, Nickel          1978
23209                       Noll                  1979
44497                       Nelson, Slowinski     1979
86243                       Slowinski             1982
132049                      Slowinski             1983
216091                      Slowinski             1985
110503                      Colquitt, Welsh       1988
756839                      Slowinski, Gage       1992
859433                      Slowinski, Gage       1994
1257787                     Slowinski, Gage       1996
1398269*                    Armengaud             1996
2976221*                    Spence                1997
3021377*                    Clarkson              1998
6972593*                    Hajratwala            1999
13466917*                   Cameron               2001
20996011*                   Shafer                2003
24036583*                   Findley               2004
25964951*                   Nowak                 2005
30402457*                   Cooper, Boone         2005
32582657*                   Cooper, Boone         2006
37156667*                   Elvenich              2008
42643801*                   Strindmo              2009
43112609*                   Smith                 2008


Table 14.1: Primes p > 500 for Which 2^p − 1 Is Known to be Prime


*Discovered with GIMPS (Woltman, Kurowski, ...)


Question 14.2. Are there infinitely many Mersenne primes, or does the list of 
Mersenne primes eventually stop? 


Exercises 


14.1. If a^n + 1 is prime for some numbers a ≥ 2 and n ≥ 1, show that n must be a power
of 2.


14.2. Let F_k = 2^(2^k) + 1. For example, F_1 = 5, F_2 = 17, F_3 = 257, and F_4 = 65537.
Fermat thought that all the F_k’s might be prime, but Euler showed in 1732 that F_5 factors
as 641 · 6700417, and in 1880 Landry showed that F_6 is composite. Primes of the form
F_k are called Fermat primes. Show that if k ≠ m, then the numbers F_k and F_m have no
common factors; that is, show that gcd(F_k, F_m) = 1. [Hint. If k > m, show that F_m
divides F_k − 2.]


14.3. The numbers 3^n − 1 are never prime (if n ≥ 2), since they are always even. However,
it sometimes happens that (3^n − 1)/2 is prime. For example, (3^3 − 1)/2 = 13 is prime.
(a) Find another prime of the form (3^n − 1)/2.
(b) If n is even, show that (3^n − 1)/2 is always divisible by 4, so it can never be prime.

(c) Use a similar argument to show that if n is a multiple of 5 then (3^n − 1)/2 is never a
prime.

(d) Do you think that there are infinitely many primes of the form (3^n − 1)/2?


Chapter 15 


Mersenne Primes and Perfect 
Numbers 


The ancient Greeks observed that the number 6 has a surprising property. If you 
take the proper divisors of 6, that is, the divisors other than 6 itself, and add them 
up, you get back the number 6. Thus, the proper divisors of 6 are 1, 2, and 3, and 
when you add these divisors, you get 


1 + 2 + 3 = 6.


This property is rather rare, as can be seen by looking at a few examples: 


n     Sum of Proper Divisors of n

6     1 + 2 + 3 = 6                  Sum is just right (perfect!).
10    1 + 2 + 5 = 8                  Sum is too small.
12    1 + 2 + 3 + 4 + 6 = 16         Sum is too large.
15    1 + 3 + 5 = 9                  Sum is too small.
20    1 + 2 + 4 + 5 + 10 = 22        Sum is too large.
28    1 + 2 + 4 + 7 + 14 = 28        Sum is just right (perfect!).
45    1 + 3 + 5 + 9 + 15 = 33        Sum is too small.


The Greeks called these special numbers perfect. That is, a perfect number is a 
number that is equal to the sum of its proper divisors. So far, we have discovered 
two perfect numbers, 6 and 28. Are there others? 

The Greeks knew a method for finding some perfect numbers and, interestingly 
enough, their method is closely related to the Mersenne primes that we studied in 
the previous chapter. The following assertion occurs as Proposition 36 of Book IX 
of Euclid’s Elements. 
Theorem 15.1 (Euclid’s Perfect Number Formula). If 2^p − 1 is a prime number,
then 2^(p−1)(2^p − 1) is a perfect number.


The first two Mersenne primes are 3 = 2^2 − 1 and 7 = 2^3 − 1. Euclid’s Per-
fect Number Formula applied to these two Mersenne primes gives the two perfect
numbers we already know,


2^(2−1)(2^2 − 1) = 6   and   2^(3−1)(2^3 − 1) = 28.


The next Mersenne prime is 2^5 − 1 = 31, and Euclid’s formula gives us a new
perfect number,
2^(5−1)(2^5 − 1) = 496.


To check that 496 is perfect, we need to sum its proper divisors. Factoring 496 =
2^4 · 31, we see that the proper divisors of 496 are


1, 2, 2^2, 2^3, 2^4   and   31, 2·31, 2^2·31, 2^3·31.


We could just add these numbers, but to illustrate the general method we will sum
them in two stages. First


1 + 2 + 2^2 + 2^3 + 2^4 = 31,
and second
31 + 2·31 + 2^2·31 + 2^3·31 = 31(1 + 2 + 2^2 + 2^3) = 31·15.


Now adding the two pieces gives 31 + 31·15 = 31·16 = 496, so 496 is indeed
perfect.

Using the same sort of idea, we can easily verify that Euclid’s Perfect Number
Formula is true in general. We let q = 2^p − 1, and we need to check that 2^(p−1) q is
a perfect number. The proper divisors of 2^(p−1) q are


1, 2, 4, ..., 2^(p−1)   and   q, 2q, 4q, ..., 2^(p−2) q.


We add these numbers using the formula for the Geometric Series from Chapter 14. The
Geometric Series formula (slightly rearranged) says that


1 + x + x^2 + ··· + x^(n−1) = (x^n − 1)/(x − 1).

Putting x = 2 and n = p, we get

1 + 2 + 2^2 + ··· + 2^(p−1) = (2^p − 1)/(2 − 1) = 2^p − 1 = q.

And we can use the formula with x = 2 and n = p − 1 to compute


q + 2q + 4q + ··· + 2^(p−2) q = q(1 + 2 + 4 + ··· + 2^(p−2))
                              = q((2^(p−1) − 1)/(2 − 1))
                              = q(2^(p−1) − 1).

So if we add all the proper divisors of 2^(p−1) q, we get

1 + 2 + 4 + ··· + 2^(p−1) + q + 2q + 4q + ··· + 2^(p−2) q = q + q(2^(p−1) − 1) = 2^(p−1) q.


This shows that 2^(p−1) q is a perfect number.
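Euclid’s formula can be tested directly by summing proper divisors. In the Python sketch below, the name `proper_divisor_sum` is ours, and pairing each divisor d with n/d is simply a device to keep the search short. The exponents are the first few for which 2^p − 1 is prime.

```python
def proper_divisor_sum(n):
    """Return the sum of the divisors of n other than n itself."""
    if n <= 1:
        return 0
    total = 1  # 1 is a proper divisor of every n > 1
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d  # the complementary divisor n/d
        d += 1
    return total


# Euclid's formula for the first few Mersenne prime exponents.
for p in (2, 3, 5, 7, 13, 17):
    n = 2 ** (p - 1) * (2 ** p - 1)
    print(p, n, proper_divisor_sum(n) == n)

# A direct search for perfect numbers below 10000.
print([n for n in range(2, 10000) if proper_divisor_sum(n) == n])
```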

We can use Euclid’s Perfect Number Formula to write down many more perfect 
numbers. In fact, we get one perfect number for each Mersenne prime that we 
can find. The first few perfect numbers obtained in this fashion are listed in the 
following table. As you will observe, the numbers get large rather quickly. 


p                    2    3     5      7       13          17
2^(p−1)(2^p − 1)     6    28    496    8128    33550336    8589869056


We can also list perfect numbers that are incredibly huge. For example, 


2^756838 (2^756839 − 1)   and   2^859432 (2^859433 − 1)


are perfect numbers. The latter has more than half a million digits!

A natural question to ask at this point is whether Euclid’s Perfect Number For-
mula actually describes all perfect numbers. In other words, does every perfect
number look like 2^(p−1)(2^p − 1) with 2^p − 1 prime, or are there other perfect num-
bers? Approximately 2000 years after Euclid’s death, Leonhard Euler showed that
Euclid’s formula at least gives all even perfect numbers.


Theorem 15.2 (Euler’s Perfect Number Theorem). If n is an even perfect number,
then n looks like
n = 2^(p−1)(2^p − 1),


where 2^p − 1 is a Mersenne prime.


We will prove Euler’s theorem at the end of this chapter, but first we need to
discuss a function that will be needed for the proof. This function, which is denoted
by the Greek letter σ (sigma), is equal to


σ(n) = sum of all divisors of n (including 1 and n).
Here are a few examples:


σ(6) = 1 + 2 + 3 + 6 = 12
σ(8) = 1 + 2 + 4 + 8 = 15
σ(18) = 1 + 2 + 3 + 6 + 9 + 18 = 39.


We can also give some general formulas. For example, if p is a prime number, then
its only divisors are 1 and p, so σ(p) = p + 1. More generally, the divisors of a
prime power p^k are the numbers 1, p, p^2, ..., p^k, so


σ(p^k) = 1 + p + p^2 + ··· + p^k = (p^(k+1) − 1)/(p − 1).


To study the sigma function further, we make a short table of its values. 


σ(1) = 1      σ(2) = 3      σ(3) = 4      σ(4) = 7      σ(5) = 6
σ(6) = 12     σ(7) = 8      σ(8) = 15     σ(9) = 13     σ(10) = 18
σ(11) = 12    σ(12) = 28    σ(13) = 14    σ(14) = 24    σ(15) = 24
σ(16) = 31    σ(17) = 18    σ(18) = 39    σ(19) = 20    σ(20) = 42
σ(21) = 32    σ(22) = 36    σ(23) = 24    σ(24) = 60    σ(25) = 31
σ(26) = 42    σ(27) = 40    σ(28) = 56    σ(29) = 30    σ(30) = 72
σ(31) = 32    σ(32) = 63    σ(33) = 48    σ(34) = 54    σ(35) = 48
σ(36) = 91    σ(37) = 38    σ(38) = 60    σ(39) = 56    σ(40) = 90
σ(41) = 42    σ(42) = 96    σ(43) = 44    σ(44) = 84    σ(45) = 78
σ(46) = 72    σ(47) = 48    σ(48) = 124   σ(49) = 57    σ(50) = 93
σ(51) = 72    σ(52) = 98    σ(53) = 54    σ(54) = 120   σ(55) = 72
σ(56) = 120   σ(57) = 80    σ(58) = 90    σ(59) = 60    σ(60) = 168
σ(61) = 62    σ(62) = 96    σ(63) = 104   σ(64) = 127   σ(65) = 84


An examination of this table reveals that $\sigma(mn)$ is frequently equal to the product
$\sigma(m)\sigma(n)$ and, after a little further analysis, we notice that this seems to be true
when $m$ and $n$ are relatively prime. Thus, the sigma function appears to obey the
same sort of multiplication formula as the phi function that we studied in
Chapter 11. We record this rule, together with the formula for $\sigma(p^k)$.


Theorem 15.3 (Sigma Function Formulas). (a) If $p$ is a prime and $k \ge 1$, then

$$\sigma(p^k) = 1 + p + p^2 + \cdots + p^k = \frac{p^{k+1} - 1}{p - 1}.$$

(b) If $\gcd(m, n) = 1$, then

$$\sigma(mn) = \sigma(m)\sigma(n).$$
Just as with the phi function, we can use the sigma function formulas to easily
compute $\sigma(n)$ for large values of $n$. For example,

$$\begin{aligned}
\sigma(16072) &= \sigma(2^3 \cdot 7^2 \cdot 41) \\
&= \sigma(2^3) \cdot \sigma(7^2) \cdot \sigma(41) \\
&= (1 + 2 + 2^2 + 2^3)(1 + 7 + 7^2)(1 + 41) \\
&= 15 \cdot 57 \cdot 42 = 35910,
\end{aligned}$$

and

$$\sigma(800000) = \sigma(2^8 \cdot 5^5) = \frac{2^9 - 1}{2 - 1} \cdot \frac{5^6 - 1}{5 - 1} = 511 \cdot \frac{15624}{4} = 1995966.$$
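The two computations above follow a recipe that is easy to mechanize: factor $n$, apply the prime power formula to each factor, and multiply. Here is a minimal Python sketch of that recipe. The helper names `factor` and `sigma` are our own choices, and the trial-division factoring is only meant for numbers of modest size.

```python
def factor(n: int) -> dict[int, int]:
    """Prime factorization by trial division, returned as {prime: exponent}."""
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:  # whatever remains is a single prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def sigma(n: int) -> int:
    """Sum of divisors, using sigma(p^k) = (p^(k+1) - 1)/(p - 1) and multiplicativity."""
    total = 1
    for p, k in factor(n).items():
        total *= (p ** (k + 1) - 1) // (p - 1)
    return total


print(sigma(16072), sigma(800000))
```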


At this point you probably expect that I will show you how to prove the
multiplication formula for the sigma function. But I won’t! You have now made enough
progress in number theory that it is time for you to start acting as a mathematician
yourself.¹ So I am going to ask you to prove the formula $\sigma(mn) = \sigma(m)\sigma(n)$
for relatively prime integers $m$ and $n$. Don’t be discouraged and give up if you
don’t succeed at first. One suggestion I can give you is to try to discover why the
formula is true before you attempt to give a general proof. So, for example, first
look at numbers like $21 = 3 \cdot 7$ and $65 = 5 \cdot 13$ that are products of two primes and
list their divisors. This should enable you to prove that $\sigma(pq) = \sigma(p)\sigma(q)$ when $p$
and $q$ are distinct prime numbers. Then try some $m$’s and $n$’s that have two or three
divisors each and try to see how the divisors of $m$ and $n$ fit together to give divisors
of $mn$. If you can describe this precisely enough, you should be able to prove that
$\sigma(mn) = \sigma(m)\sigma(n)$. Remember, though, that you’ll need to use the fact that $m$
and $n$ are relatively prime.

How is the sigma function related to perfect numbers? A number $n$ is perfect if
the sum of its divisors, other than $n$ itself, is equal to $n$. The sigma function $\sigma(n)$
is the sum of the divisors of $n$, including $n$, so it has an “extra” $n$. Therefore,

$$n \text{ is perfect exactly when } \sigma(n) = 2n.$$
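The criterion $\sigma(n) = 2n$ is easy to test directly for small $n$. The following sketch is an illustration only: it sums divisors by brute force rather than by factoring, and the names `sigma` and `is_perfect` are our own.

```python
def sigma(n: int) -> int:
    """Sum of all divisors of n, found by pairing each divisor d with n // d."""
    total = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:  # avoid counting a square root twice
                total += n // d
        d += 1
    return total


def is_perfect(n: int) -> bool:
    """n is perfect exactly when sigma(n) == 2n."""
    return sigma(n) == 2 * n


perfect = [n for n in range(1, 10000) if is_perfect(n)]
print(perfect)
```

Every number it finds below 10000 is even, in keeping with the fact that no odd perfect number is known.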


We are now ready to prove Euler’s formula for even perfect numbers, which we
restate here for your convenience.


¹Your mission, should you decide to accept it, is to prove the multiplication formula for the sigma
function. Should you be captured or killed in this endeavor, we will be forced to deny all knowledge
of your activities. Good luck!
Theorem 15.4 (Euler’s Perfect Number Theorem). If $n$ is an even perfect number,
then $n$ looks like

$$n = 2^{p-1}(2^p - 1),$$

where $2^p - 1$ is a Mersenne prime.


Proof. Suppose that $n$ is an even perfect number. The fact that $n$ is even means
that we can factor it as

$$n = 2^k m \quad\text{with } k \ge 1 \text{ and } m \text{ odd.}$$

Next we use the sigma function formulas to compute $\sigma(n)$,

$$\begin{aligned}
\sigma(n) &= \sigma(2^k m) && \text{since } n = 2^k m, \\
&= \sigma(2^k)\sigma(m) && \text{using the multiplication formula for } \sigma \text{ and the fact that } \gcd(2^k, m) = 1, \\
&= (2^{k+1} - 1)\sigma(m) && \text{using the formula for } \sigma(p^k) \text{ with } p = 2.
\end{aligned}$$

But $n$ is supposed to be perfect, which means that $\sigma(n) = 2n = 2^{k+1}m$. So we
have two different expressions for $\sigma(n)$, and they must be equal,

$$2^{k+1} m = (2^{k+1} - 1)\sigma(m).$$

The number $2^{k+1} - 1$ is clearly odd, and $(2^{k+1} - 1)\sigma(m)$ is a multiple of
$2^{k+1}$, so $2^{k+1}$ must divide $\sigma(m)$. In other words, there is some number $c$ such that
$\sigma(m) = 2^{k+1}c$. We can substitute this into the above equation to get

$$2^{k+1} m = (2^{k+1} - 1)\sigma(m) = (2^{k+1} - 1)2^{k+1}c,$$

and then canceling $2^{k+1}$ from both sides gives $m = (2^{k+1} - 1)c$. To recapitulate,
we have shown that there is an integer $c$ such that

$$m = (2^{k+1} - 1)c \quad\text{and}\quad \sigma(m) = 2^{k+1}c.$$


We are going to show that $c = 1$ by assuming that $c > 1$ and deriving a false
statement. (This is called a “proof by contradiction.”) So suppose that $c > 1$. Then
$m = (2^{k+1} - 1)c$ would be divisible by the distinct numbers

$$1, \quad c, \quad\text{and}\quad m.$$

(N.B. The fact that our original number $n$ was even means that $k \ge 1$, so $c$ and $m$
are different.) Of course, $m$ is probably divisible by many other numbers, but in
any case we find that

$$\sigma(m) \ge 1 + c + m = 1 + c + (2^{k+1} - 1)c = 1 + 2^{k+1}c.$$
However, we also know that $\sigma(m) = 2^{k+1}c$, so

$$2^{k+1}c \ge 1 + 2^{k+1}c.$$

Therefore, $0 \ge 1$, which is an absurdity. This contradiction shows that $c$ must
actually be equal to 1, which means that

$$m = 2^{k+1} - 1 \quad\text{and}\quad \sigma(m) = 2^{k+1} = m + 1.$$

Which numbers $m$ have the property that $\sigma(m) = m + 1$? These are clearly
the numbers whose only divisors are 1 and $m$, since otherwise the sum of their
divisors would be larger. In other words, $\sigma(m) = m + 1$ exactly when $m$ is prime.
We have now proved that if $n$ is an even perfect number then

$$n = 2^k(2^{k+1} - 1) \quad\text{with } 2^{k+1} - 1 \text{ a prime number.}$$

We know from Chapter 14 that if $2^{k+1} - 1$ is prime then $k + 1$ must itself be prime,
say $k + 1 = p$. So every even perfect number looks like $n = 2^{p-1}(2^p - 1)$ with
$2^p - 1$ a Mersenne prime. This completes our proof of Euler’s Perfect Number
Theorem. □


Euler’s Perfect Number Theorem gives an excellent description of all even
perfect numbers, but it says nothing about odd perfect numbers.


Question 15.5 (Odd Perfect Number Quandary). Are there any odd perfect num- 
bers? 


To this day, no one has been able to discover any odd perfect numbers, although
this is not through lack of trying. Many mathematicians have written many research
papers (more than 50 papers in the last 50 years) studying these elusive creatures,
and it is currently known that there are no odd perfect numbers less than $10^{300}$.
However, no one has yet been able to prove conclusively that none exist, so for
now, odd perfect numbers are like the little man in the poem:


Last night I met upon the stair, 
A little man who wasn’t there. 
He wasn’t there again today. 
I wish to heck he’d go away. 


Anonymous 
If you do some experimentation with small numbers, you might suspect that
$\sigma(n) < 2n$ for all odd numbers. If this were true, it would certainly prove that
there are no odd perfect numbers, but unfortunately it is not true. The first odd
number for which it is false is $n = 945 = 3^3 \cdot 5 \cdot 7$, which has $\sigma(945) = 1920$.
This example should serve as a warning against believing a fact to be true simply 
because it has been checked for lots of small numbers. It is perfectly all right to 
make conjectures based on numerical data, but mathematicians insist on rigorous 
proofs precisely because such data can be misleading. 
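A short search makes the warning concrete. The sketch below is our own illustration with a brute-force `sigma`; it looks for the first odd $n$ with $\sigma(n) \ge 2n$. Notice how many odd numbers pass the test $\sigma(n) < 2n$ before the first failure appears.

```python
def sigma(n: int) -> int:
    """Sum of all divisors of n, by brute force."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


# Scan the odd numbers in increasing order until sigma(n) < 2n first fails.
first_failure = next(n for n in range(1, 2000, 2) if sigma(n) >= 2 * n)
odd_numbers_checked = (first_failure - 1) // 2  # odd numbers below it, all with sigma(n) < 2n

print(first_failure, sigma(first_failure), odd_numbers_checked)
```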


Exercises 


15.1. If $m$ and $n$ are integers with $\gcd(m, n) = 1$, prove that $\sigma(mn) = \sigma(m)\sigma(n)$.


15.2. Compute the following values of the sigma function.
(a) $\sigma(10)$ (b) $\sigma(20)$ (c) $\sigma(1728)$


15.3. (a) Show that a power of 3 can never be a perfect number.

(b) More generally, if $p$ is an odd prime, show that a power $p^k$ can never be a perfect
number.

(c) Show that a number of the form $3^i \cdot 5^j$ can never be a perfect number.

(d) More generally, if $p$ is an odd prime number greater than 3, show that the product
$3^i p^j$ can never be a perfect number.

(e) Even more generally, show that if $p$ and $q$ are distinct odd primes, then a number of
the form $q^i p^j$ can never be a perfect number.


15.4. Show that a number of the form $3^m \cdot 5^n \cdot 7^k$ can never be a perfect number.


15.5. Prove that a square number can never be a perfect number. [Hint. Compute the value
of $\sigma(n^2)$ for the first few values of $n$. Are the values odd or even?]


15.6. A perfect number is equal to the sum of its divisors (other than itself). If we look at 
the product instead of the sum, we could say that a number is product perfect if the product 
of all its divisors (other than itself) is equal to the original number. For example, 


| $m$ | Product of factors | |
| 6 | 1 · 2 · 3 = 6 | product perfect |
| 9 | 1 · 3 = 3 | product is too small |
| 12 | 1 · 2 · 3 · 4 · 6 = 144 | product is too large |
| 15 | 1 · 3 · 5 = 15 | product perfect |


So 6 and 15 are product perfect, while 9 and 12 are not product perfect. 

(a) List all product perfect numbers between 2 and 50. 

(b) Describe all product perfect numbers. Your description should be precise enough to 
enable you easily to solve problems such as “Is 35710 product perfect?” and “Find a 
product perfect number larger than 10000.” 

(c) Prove that your description in (b) is correct. 


15.7. (a) Write a program to compute $\sigma(n)$, the sum of all the divisors of $n$
(including 1 and $n$ itself). You should compute $\sigma(n)$ by using a factorization of $n$ into
primes, not by actually finding all the divisors of $n$ and adding them up.
(b) As you know, the Greeks called $n$ perfect if $\sigma(n) = 2n$. They also called $n$ abundant
if $\sigma(n) > 2n$, and they called $n$ deficient if $\sigma(n) < 2n$. Count how many $n$’s
between 2 and 100 are perfect, abundant, and deficient. Clearly, perfect numbers are
very rare. Which do you think are more common, abundant numbers or deficient
numbers? Extend your list for $100 < n \le 200$ and see if your guess still holds.


15.8. The Greeks called two numbers m and n an amicable pair if the sum of the proper 
divisors of m equals n and simultaneously the sum of the proper divisors of n equals m. 
(The proper divisors of a number 7 are all divisors of n excluding n itself.) The first 
amicable pair, and the only one (as far as we know) that was known in ancient Greece, is 
the pair (220, 284). This pair is amicable since 


284 = 1 + 2 + 4 + 5 + 10 + 11 + 20 + 22 + 44 + 55 + 110   (divisors of 220)
220 = 1 + 2 + 4 + 71 + 142                                (divisors of 284).


(a) Show that $m$ and $n$ form an amicable pair if and only if $\sigma(n)$ and $\sigma(m)$ both equal
$n + m$.
(b) Verify that each of the following pairs is an amicable pair of numbers. 


(220, 284), (1184, 1210), (2620, 2924), (5020, 5564), (6232, 6368), 
(10744, 10856), (12285, 14595). 


(c) There is a rule for generating amicable numbers, although it does not generate all of 
them. This rule was first discovered by Abu-l-Hasan Thabit ben Korrah around the 
ninth century and later rediscovered by many others, including Fermat and Descartes. 
The rule says to look at the three numbers 


$$\begin{aligned}
p &= 3 \cdot 2^{e-1} - 1, \\
q &= 2p + 1 = 3 \cdot 2^e - 1, \\
r &= (p + 1)(q + 1) - 1 = 9 \cdot 2^{2e-1} - 1.
\end{aligned}$$

If all of $p$, $q$, and $r$ happen to be odd primes, then $m = 2^e pq$ and $n = 2^e r$ are
amicable. Prove that the method of Thabit ben Korrah gives amicable pairs.

(d) Taking $e = 2$ in Thabit ben Korrah’s method gives the pair (220, 284). Use his
method to find a second pair. If you have access to a computer that will do
factorizations for you, try to use Thabit ben Korrah’s method to find additional amicable
pairs.


15.9. Let

$$s(n) = \sigma(n) - n = \text{sum of proper divisors of } n;$$

that is, $s(n)$ is equal to the sum of all divisors of $n$ other than $n$ itself. So $n$ is perfect if
$s(n) = n$, and $(m, n)$ are an amicable pair if $s(m) = n$ and $s(n) = m$. More generally, a
collection of numbers $n_1, n_2, \ldots, n_t$ is called sociable (of order $t$) if

$$s(n_1) = n_2, \quad s(n_2) = n_3, \quad \ldots, \quad s(n_{t-1}) = n_t, \quad s(n_t) = n_1.$$




(An older name for a list of this sort is an Aliquot cycle.) For example, the numbers 


14316, 19116, 31704, 47616, 83328, 177792, 295488, 

629072, 589786, 294896, 358336, 418904, 366556, 274924, 

275444, 243760, 376736, 381028, 285778, 152990, 122410, 
97946, 48976, 45946, 22976, 22744, 19916, 17716 


are a sociable collection of numbers of order 28. 

(a) There is one other collection of sociable numbers that contains a number smaller than 
16000. It has order 5. Find these five numbers. 

(b) Up until 1970, the only known collections of sociable numbers of order at least 3 
were these two examples of order 5 and 28. The next such collection has order 4, and 
its smallest member is larger than 1,000,000. Find it. 

(c) Find a sociable collection of order 9 whose smallest member is larger than 


800,000,000. 


This is the only known example of order 9. 
(d) Find a sociable collection of order 6 whose smallest member is larger than 


90,000,000,000. 


There are two known examples of order 6; this is the smallest. 


Chapter 16 


Powers Modulo m and
Successive Squaring


How would you compute

$$5^{100000000000000} \pmod{12830603}?$$


If 12830603 were prime, you might try using Fermat’s Little Theorem (Chapter 9),
and even if it is not prime, Euler’s Formula (Chapter 10) is available. In fact, it
turns out that $12830603 = 3571 \cdot 3593$ and

$$\phi(12830603) = \phi(3571)\phi(3593) = 3570 \cdot 3592 = 12823440.$$

Euler’s Formula tells us that

$$a^{\phi(m)} \equiv 1 \pmod{m} \quad\text{for any } a \text{ and } m \text{ with } \gcd(a, m) = 1,$$

so we can use the fact that

$$100000000000000 = 7798219 \cdot 12823440 + 6546640$$

to “simplify” our problem,

$$5^{100000000000000} = (5^{12823440})^{7798219} \cdot 5^{6546640} \equiv 5^{6546640} \pmod{12830603}.$$

Now we “only” have to compute the $6546640^{\text{th}}$ power of 5 and then reduce it
modulo 12830603. Unfortunately, the number $5^{6546640}$ has more than 4 million
digits, so it would be difficult to calculate even with a computer. And later we will
want to compute $a^k \pmod{m}$ for numbers $a$, $k$, and $m$ having hundreds of digits,
in which case the number of digits in $a^k$ is larger than the number of subatomic
particles in the known universe! We need to find a better method.
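The arithmetic in this reduction can be double-checked by machine. In the sketch below the variable names are our own, and Python's built-in three-argument `pow` is used as a black box for modular powers; it stands in for the efficient method developed in the rest of this chapter.

```python
p, q = 3571, 3593
m = p * q                    # the modulus 12830603
phi = (p - 1) * (q - 1)      # phi(m) for a product of two distinct primes

k = 10**14                   # the original exponent 100000000000000
quotient, remainder = divmod(k, phi)

# Euler's formula lets us replace the exponent k by its remainder modulo phi(m),
# because gcd(5, m) = 1.
same_answer = pow(5, k, m) == pow(5, remainder, m)

print(m, phi, quotient, remainder, same_answer)
```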

You may well be asking why anyone would want to compute such large powers.
Aside from the intrinsic interest (if any) of being able to perform computations with
large numbers,¹ there is a very practical reason. As we will see later, it is possible to
use the computation of $a^k \pmod{m}$ to encode and decode messages. Amazingly
enough, the resulting codes are so good that they are unbreakable by even the most
sophisticated code-breaking techniques currently known. Having thus piqued your
curiosity, we will spend the remainder of this chapter and the next discussing how
to compute large powers and large roots modulo $m$. Then in Chapter 18 we will
explain how to use such computations to create “unbreakable” codes.

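The clever idea used to compute $a^k \pmod{m}$ is called the Method of Successive
Squaring. Before describing the method in general, we illustrate it by computing

$$7^{327} \pmod{853}.$$

The first step is to create a table giving the values of $7, 7^2, 7^4, 7^8, 7^{16}, \ldots$
modulo 853. Notice that to get each successive entry in the list, we merely need to
square the previous number. Furthermore, since we always reduce modulo 853
before squaring, we never have to work with any numbers larger than $852^2$. Here’s
the table of $2^k$-powers of 7 modulo 853.

$$\begin{aligned}
7^1 &\equiv 7 &&\equiv 7 \pmod{853} \\
7^2 &\equiv (7^1)^2 \equiv 7^2 \equiv 49 &&\equiv 49 \pmod{853} \\
7^4 &\equiv (7^2)^2 \equiv 49^2 \equiv 2401 &&\equiv 695 \pmod{853} \\
7^8 &\equiv (7^4)^2 \equiv 695^2 \equiv 483025 &&\equiv 227 \pmod{853} \\
7^{16} &\equiv (7^8)^2 \equiv 227^2 \equiv 51529 &&\equiv 349 \pmod{853} \\
7^{32} &\equiv (7^{16})^2 \equiv 349^2 \equiv 121801 &&\equiv 675 \pmod{853} \\
7^{64} &\equiv (7^{32})^2 \equiv 675^2 \equiv 455625 &&\equiv 123 \pmod{853} \\
7^{128} &\equiv (7^{64})^2 \equiv 123^2 \equiv 15129 &&\equiv 628 \pmod{853} \\
7^{256} &\equiv (7^{128})^2 \equiv 628^2 \equiv 394384 &&\equiv 298 \pmod{853}
\end{aligned}$$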


The next step is to write the exponent 327 as a sum of powers of 2. This is called
the binary expansion of 327. The largest power of 2 less than 327 is $2^8 = 256$, so
we write $327 = 256 + 71$. Then the largest power of 2 less than 71 is $2^6 = 64$, so
$327 = 256 + 64 + 7$. And so on:

$$\begin{aligned}
327 &= 256 + 71 \\
&= 256 + 64 + 7 \\
&= 256 + 64 + 4 + 3 \\
&= 256 + 64 + 4 + 2 + 1.
\end{aligned}$$


¹Question from a fourth grader: “What do mathematicians do, anyway, multiply really big
numbers?”


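Now we use the binary expansion of 327 to compute

$$\begin{aligned}
7^{327} &= 7^{256 + 64 + 4 + 2 + 1} \\
&= 7^{256} \cdot 7^{64} \cdot 7^4 \cdot 7^2 \cdot 7^1 \\
&\equiv 298 \cdot 123 \cdot 695 \cdot 49 \cdot 7 \pmod{853}.
\end{aligned}$$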


The numbers in the last line are taken from the table of powers of 7 that we
computed earlier.

To complete the computation of $7^{327} \pmod{853}$, we just need to multiply the
five numbers $298 \cdot 123 \cdot 695 \cdot 49 \cdot 7$ and reduce them modulo 853. And if the
product of all five numbers is too large for our taste, we can just multiply the first two,
reduce modulo 853, multiply by the third, reduce modulo 853, and so on. In this
way, we still never need to work with any number larger than $852^2$. Thus,

$$\begin{aligned}
298 \cdot 123 \cdot 695 \cdot 49 \cdot 7 &\equiv 828 \cdot 695 \cdot 49 \cdot 7 \equiv 538 \cdot 49 \cdot 7 \\
&\equiv 772 \cdot 7 \equiv 286 \pmod{853}.
\end{aligned}$$

We’re done!

$$7^{327} \equiv 286 \pmod{853}.$$
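The worked example can be replayed in a few lines of Python. This is only a sketch of the computation just carried out by hand, with variable names of our own choosing. It builds the table of repeated squares, picks out the entries named by the binary expansion of 327, and multiplies them together, reducing modulo 853 at every step. The entries are multiplied in a different order from the text, which does not change the product modulo 853.

```python
m = 853

# Table of 7^(2^i) mod 853 for i = 0, 1, ..., 8; each entry is the square of the last.
table = [7 % m]
for _ in range(8):
    table.append(table[-1] ** 2 % m)

# Binary expansion of the exponent: the positions i where bit i of 327 is set.
exponent = 327
bits = [i for i in range(exponent.bit_length()) if (exponent >> i) & 1]

# Multiply the selected table entries, reducing modulo 853 after every product.
result = 1
for i in bits:
    result = result * table[i] % m

print(table)
print([2**i for i in bits], result)
```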


This may seem like a lot of work, but suppose that instead we try to compute
$7^{327} \pmod{853}$ directly by first computing $7^{327}$ and then dividing by 853 and
taking the remainder. It is possible to do this with a small computer, since

$$7^{327} = 29936123868955180582 \underbrace{\cdots\cdots\cdots}_{\text{237 digits omitted}} 32584937995509879543 \equiv 286 \pmod{853},$$

but, as you can see, the numbers get quite large. And it is completely infeasible to
compute $a^k$ exactly when $k$ has, say, 20 digits, much less when $k$ has the hundreds
of digits required for the construction of secure codes.

On the other hand, the method of successive squaring can be used to compute
$a^k \pmod{m}$ even when $k$ has hundreds or thousands of digits, because a careful
analysis of the method shows that it takes approximately $\log_2(k)$ steps to compute
$a^k \pmod{m}$. We will not perform this analysis here but will observe that $\log_2(k)$
is more-or-less 3.322 times the number of digits in $k$. So if $k$ has, say, 1000 digits,
then it takes approximately 3322 steps to compute $a^k \pmod{m}$. Admittedly, this is
a lot of steps to do by hand, but it is the work of an instant on even a small desktop
computer. To give you an idea of the times involved, my laptop computer (with a
1500-MHz Pentium chip, for those who are technically inclined) used successive
squaring to compute


$$\begin{aligned}
7^{10^{200,000}} &\equiv 787 \pmod{853} \quad\text{in 0.36 seconds and} \\
7^{10^{2,000,000}} &\equiv 303 \pmod{853} \quad\text{in 4.48 seconds.}
\end{aligned}$$
We now describe the general method of computing powers by successive
squaring.


Algorithm 16.1 (Successive Squaring to Compute $a^k \pmod{m}$). The following
steps compute the value of $a^k \pmod{m}$:


1. Write $k$ as a sum of powers of 2,

$$k = u_0 + u_1 \cdot 2 + u_2 \cdot 4 + u_3 \cdot 8 + \cdots + u_r \cdot 2^r,$$

where each $u_i$ is either 0 or 1. (This is called the binary expansion of $k$.)


2. Make a table of powers of $a$ modulo $m$ using successive squaring.

$$\begin{aligned}
a^1 &\equiv A_0 \pmod{m} \\
a^2 &\equiv (a^1)^2 \equiv A_0^2 \equiv A_1 \pmod{m} \\
a^4 &\equiv (a^2)^2 \equiv A_1^2 \equiv A_2 \pmod{m} \\
a^8 &\equiv (a^4)^2 \equiv A_2^2 \equiv A_3 \pmod{m} \\
&\;\;\vdots \\
a^{2^r} &\equiv (a^{2^{r-1}})^2 \equiv A_{r-1}^2 \equiv A_r \pmod{m}
\end{aligned}$$

Note that to compute each line of the table you only need to take the number
at the end of the previous line, square it, and then reduce it modulo $m$. Also
note that the table has $r + 1$ lines, where $r$ is the highest exponent of 2
appearing in the binary expansion of $k$ in Step 1.


3. The product

$$A_0^{u_0} \cdot A_1^{u_1} \cdot A_2^{u_2} \cdots A_r^{u_r} \pmod{m}$$

will be congruent to $a^k \pmod{m}$. Note that all the $u_i$’s are either 0 or 1, so
this number is really the product of those $A_i$’s for which $u_i$ equals 1.
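Proof. Why does it work? We compute

$$\begin{aligned}
a^k &= a^{u_0 + u_1 \cdot 2 + u_2 \cdot 4 + u_3 \cdot 8 + \cdots + u_r \cdot 2^r} && \text{using Step 1,} \\
&= a^{u_0} \cdot (a^2)^{u_1} \cdot (a^4)^{u_2} \cdots (a^{2^r})^{u_r} \\
&\equiv A_0^{u_0} \cdot A_1^{u_1} \cdot A_2^{u_2} \cdots A_r^{u_r} \pmod{m} && \text{using the table from Step 2.} \qquad \square
\end{aligned}$$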
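For readers who like to see an algorithm as a program, here is one possible Python rendering of the three steps of Algorithm 16.1. It is a sketch rather than a tuned implementation: the function name `power_mod` and its variable names are our own, and it deliberately builds the full table of Step 2 so that it mirrors the description above line by line.

```python
def power_mod(a: int, k: int, m: int) -> int:
    """Compute a^k mod m (k >= 0, m >= 1) by the three steps of successive squaring."""
    # Step 1: binary expansion k = u_0 + u_1*2 + u_2*4 + ... + u_r*2^r.
    digits = []
    n = k
    while n > 0:
        digits.append(n % 2)
        n //= 2

    # Step 2: table A_i = a^(2^i) mod m, each entry the square of the previous one.
    table = []
    entry = a % m
    for _ in digits:
        table.append(entry)
        entry = entry * entry % m

    # Step 3: multiply together those A_i for which u_i equals 1.
    result = 1 % m
    for u, entry in zip(digits, table):
        if u == 1:
            result = result * entry % m
    return result


print(power_mod(7, 327, 853))
```

No intermediate number ever exceeds $(m - 1)^2$, which is why the method stays practical even when $k$ is enormous.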


As mentioned earlier, computing large powers $a^k \pmod{m}$ has a real-world
use in creating secure codes. To create these codes, it is necessary to find a few
large primes, say primes with between 100 and 200 digits. This brings up the
question of how to check whether or not a given number $m$ is prime. A surefire
but inefficient method is to try dividing by each number up to $\sqrt{m}$ and see if you
find any factors. If not, then $m$ is prime. Unfortunately, this method is not practical
even for $m$’s of moderate size.

Using successive squaring and Fermat’s Little Theorem (Chapter 9), we can
often show that a number $m$ is composite without finding any factors at all! Here’s
how. Take any number $a$ less than $m$. First compute $\gcd(a, m)$. If it is greater
than 1, then you’ve found a factor of $m$, so $m$ is composite and you’re done. On
the other hand, if $\gcd(a, m) = 1$, use successive squaring to compute

$$a^{m-1} \pmod{m}.$$

Fermat’s Little Theorem says that if $m$ is prime then the answer will be 1; so if
the answer turns out to be anything other than 1, you know that $m$ is composite
without actually knowing any factors.

Here’s an example. Using successive squaring we compute 


$$2^{283976710803262} \equiv 280196559097287 \pmod{283976710803263},$$


so we know that 283976710803263 is definitely not a prime. In fact, its prime 
factorization is 


$$283976710803263 = 104623 \cdot 90437 \cdot 30013.$$

Now consider $m = 630249099481$. Using successive squaring, we find that

$$2^{630249099480} \equiv 1 \pmod{630249099481}$$

and

$$3^{630249099480} \equiv 1 \pmod{630249099481}.$$
Does this mean that 630249099481 is prime? Not necessarily, but it certainly
makes it likely. And if we check $a^{m-1} \pmod{m}$ for $a = 5$, 7, and 11 and
again get 1 (which we do), then we would become even more convinced that
630249099481 is prime. Using Fermat’s Little Theorem in this way, it is never
possible to prove conclusively that a number is prime; but if $a^{m-1} \equiv 1 \pmod{m}$
for a lot of $a$’s, then we would certainly suspect that $m$ is indeed a prime. This is
how Fermat’s Little Theorem and successive squaring can be used to prove that
certain numbers are composite and to strongly suggest that certain other
numbers are prime. Unfortunately, there do exist composite numbers $m$ such that
$a^{m-1} \equiv 1 \pmod{m}$ for all $a$’s with $\gcd(a, m) = 1$. Such $m$’s are called
Carmichael numbers. The smallest Carmichael number is 561, as you verified in
Exercise 10.3. We investigate Carmichael numbers and primality testing further in
Chapter 19.
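The compositeness test just described fits in a few lines. In this sketch the function name `proves_composite` is our own, Python's built-in `pow(a, m - 1, m)` plays the role of successive squaring, and the test numbers 341 and 561 are small examples chosen for illustration.

```python
from math import gcd


def proves_composite(a: int, m: int) -> bool:
    """Return True if the base a (with 1 < a < m) shows that m is composite."""
    if gcd(a, m) > 1:
        return True                    # a shares a factor with m
    return pow(a, m - 1, m) != 1       # Fermat's Little Theorem fails, so m is not prime


# 341 = 11 * 31 slips past the base 2 but is caught by the base 3.
print(proves_composite(2, 341), proves_composite(3, 341))

# 561 = 3 * 11 * 17 is a Carmichael number: only a base sharing a factor with it helps.
print([a for a in range(2, 20) if proves_composite(a, 561)])
```

A result of `False` proves nothing by itself; it only fails to rule out that $m$ is prime.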


Exercises 


16.1. Use the method of successive squaring to compute each of the following powers.
(a) $5^{13} \pmod{23}$ (b) $28^{749} \pmod{1147}$


16.2. The method of successive squaring described in the text allows you to compute
$a^k \pmod{m}$ quite efficiently, but it does involve creating a table of powers of $a$
modulo $m$.
(a) Show that the following algorithm will also compute the value of $a^k \pmod{m}$. It is
a more efficient way to do successive squaring, well-suited for implementation on a
computer.

(1) Set $b = 1$
(2) Loop while $k \ge 1$
(3) If $k$ is odd, set $b = a \cdot b \pmod{m}$
(4) Set $a = a^2 \pmod{m}$.
(5) Set $k = k/2$ (round down if $k$ is odd)
(6) End of Loop
(7) Return the value of $b$ (which equals $a^k \pmod{m}$)

(b) Implement the above algorithm on a computer using the computer language of your
choice.
(c) Use your program to compute the following quantities:
(i) $2^{1000} \pmod{2379}$ (ii) $567^{1234} \pmod{4321}$ (iii) $47^{258008} \pmod{1315171}$


16.3. (a) Compute $7^{7386} \pmod{7387}$ by the method of successive squaring. Is 7387
prime?
(b) Compute $7^{7392} \pmod{7393}$ by the method of successive squaring. Is 7393 prime?
16.4. Write a program to check if a number $n$ is composite or probably prime as
follows. Choose 10 random numbers $a_1, a_2, \ldots, a_{10}$ between 2 and $n - 1$ and
compute $a_i^{n-1} \bmod n$ for each $a_i$. If $a_i^{n-1} \not\equiv 1 \pmod{n}$ for any $a_i$, return the message “$n$
is composite.” If $a_i^{n-1} \equiv 1 \pmod{n}$ for all the $a_i$’s, return the message “$n$ is
probably prime.”

Incorporate this program into your factorization program (Exercise 7.7) as a way to 
check when a large number is prime. 


16.5. Compute $2^{9990} \pmod{9991}$ by successive squaring and use your answer to say
whether you believe that 9991 is prime.


Chapter 17 


Computing $k^{\text{th}}$ Roots
Modulo m


In the last chapter we learned how to compute $k^{\text{th}}$ powers modulo $m$ when $k$ and $m$
are very large. Now we will travel in the opposite direction and try to compute $k^{\text{th}}$
roots modulo $m$. In other words, suppose we are given a number $b$ and told to find
a solution to the congruence

$$x^k \equiv b \pmod{m}.$$

We could try substituting $x = 0, 1, 2, \ldots$ until we find a solution, but if $m$ is large,
this could take a long time. It turns out that if we know the value of $\phi(m)$ then we
can compute the $k^{\text{th}}$ root of $b$ modulo $m$ fairly easily. As usual, we first illustrate
the method with an example.

We are going to solve the congruence 


$$x^{131} \equiv 758 \pmod{1073}.$$

The first step is to compute $\phi(1073)$. We can do this using the formulas for $\phi$ in
Chapter 11 as soon as we factor 1073 into a product of primes. This is easily done;
$1073 = 29 \cdot 37$, so $\phi(1073) = \phi(29)\phi(37) = 28 \cdot 36 = 1008$.

The next step is to find a solution in (positive) integers to the equation

$$ku - \phi(m)v = 1; \quad\text{that is, to the equation}\quad 131u - 1008v = 1.$$

We know that a solution exists, since for our example

$$\gcd(k, \phi(m)) = \gcd(131, 1008) = 1,$$
and the method described in Chapter 6 allows us to find the solution $u = 731$ and
$v = 95$. More precisely, the method in Chapter 6 gives the solution

$$131 \cdot (-277) + 1008 \cdot 36 = 1.$$

To get positive values for $u$ and $v$, we modify this solution,

$$u = -277 + 1008 = 731 \quad\text{and}\quad v = -36 + 131 = 95.$$

The equation

$$131 \cdot 731 - 1008 \cdot 95 = 1$$

provides the key to solving the original problem.
We take $x^{131}$ and raise it to the $u^{\text{th}}$ power, that is, to the $731^{\text{st}}$ power. Notice
that

$$(x^{131})^{731} = x^{131 \cdot 731} = x^{1 + 1008 \cdot 95} = x \cdot (x^{1008})^{95}.$$

But $1008 = \phi(1073)$, and Euler’s formula (Chapter 10) tells us that

$$x^{1008} \equiv 1 \pmod{1073}.$$

This means that $(x^{131})^{731} \equiv x \pmod{1073}$. So if we raise both sides of the
congruence $x^{131} \equiv 758 \pmod{1073}$ to the $731^{\text{st}}$ power, we get

$$x \equiv (x^{131})^{731} \equiv 758^{731} \pmod{1073}.$$

Now we need merely use the method of successive squares (Chapter 16) to compute
the number $758^{731} \pmod{1073}$. The answer we arrive at is $x \equiv 905 \pmod{1073}$.
Finally, as a check, we can use successive squaring to verify that $905^{131}$ is indeed
congruent to 758 modulo 1073.
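The numbers in this example can be reproduced with a short program. The sketch below uses a standard extended Euclidean algorithm to solve $131u - 1008v = 1$; the function name `extended_gcd` and the variable names are our own, and Python's built-in `pow` stands in for successive squaring.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, s, t) with a*s + b*t == g == gcd(a, b)."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t
    return old_r, old_s, old_t


k, b, m = 131, 758, 1073
phi = 28 * 36                     # phi(1073) = phi(29) * phi(37) = 1008

g, s, t = extended_gcd(k, phi)    # k*s + phi*t == 1; s may come out negative
u = s % phi                       # shift s into the range 0..phi-1, as in the text
v = (k * u - 1) // phi            # then k*u - phi*v == 1 with v positive

x = pow(b, u, m)                  # candidate solution (the text reports 905)
check = pow(x, k, m)              # should reproduce b = 758

print(u, v, x, check)
```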

Here, then, is the general method of computing roots modulo m. 


Algorithm 17.1 (How to Compute $k^{\text{th}}$ Roots Modulo $m$). Let $b$, $k$, and $m$ be given
integers that satisfy

$$\gcd(b, m) = 1 \quad\text{and}\quad \gcd(k, \phi(m)) = 1.$$

The following steps give a solution to the congruence

$$x^k \equiv b \pmod{m}.$$

1. Compute $\phi(m)$. (See Chapter 11.)


2. Find positive integers $u$ and $v$ that satisfy $ku - \phi(m)v = 1$. [See
Chapter 6. Another way to say this is that $u$ is a positive integer satisfying
$ku \equiv 1 \pmod{\phi(m)}$, so $u$ is actually the inverse of $k$ modulo $\phi(m)$.]


3. Compute $b^u \pmod{m}$ by successive squaring. (See Chapter 16.) The value
obtained gives the solution $x$.


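Why Does It Work? We need to check that $x = b^u$ is a solution to the congruence
$x^k \equiv b \pmod{m}$.

$$\begin{aligned}
x^k &= (b^u)^k && \text{substituting } x = b^u \text{ into } x^k, \\
&= b^{uk} \\
&= b^{1 + \phi(m)v} && \text{since } ku - \phi(m)v = 1 \text{ from Step 2,} \\
&= b \cdot (b^{\phi(m)})^v \\
&\equiv b \pmod{m} && \text{since } b^{\phi(m)} \equiv 1 \pmod{m} \text{ from Euler’s formula (Chapter 10).}
\end{aligned}$$

This completes the proof that $x = b^u$ provides the desired solution to the
congruence $x^k \equiv b \pmod{m}$. □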
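Algorithm 17.1 translates almost word for word into a program, provided the factorization of $m$ is supplied. The following is a minimal sketch under that assumption. The names `phi_from_factorization` and `kth_root_mod` are our own, and the call `pow(k, -1, phi)`, which computes the inverse of $k$ modulo $\phi(m)$, requires Python 3.8 or later.

```python
from math import gcd


def phi_from_factorization(factors: dict[int, int]) -> int:
    """Euler phi of m = product of p^e, using phi(p^e) = p^(e-1) * (p - 1)."""
    result = 1
    for p, e in factors.items():
        result *= p ** (e - 1) * (p - 1)
    return result


def kth_root_mod(k: int, b: int, m: int, factors: dict[int, int]) -> int:
    """Solve x^k = b (mod m), assuming gcd(b, m) = 1 and gcd(k, phi(m)) = 1."""
    phi = phi_from_factorization(factors)        # Step 1
    if gcd(b, m) != 1 or gcd(k, phi) != 1:
        raise ValueError("the method needs gcd(b, m) = 1 and gcd(k, phi(m)) = 1")
    u = pow(k, -1, phi)                          # Step 2: k*u = 1 (mod phi(m))
    return pow(b, u, m)                          # Step 3: x = b^u (mod m)


x = kth_root_mod(131, 758, 1073, {29: 1, 37: 1})
print(x, pow(x, 131, 1073))
```

Passing the factorization in as an argument makes the real cost visible: everything after Step 1 is fast, so the difficulty of the whole problem is concentrated in factoring $m$.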


The successive squaring method described in Chapter 16 is a completely
practical way to compute powers $a^k \pmod{m}$, even for very large numbers $k$ and $m$.
Is our method for finding $k^{\text{th}}$ roots modulo $m$ equally practical? In other words,
how difficult is it, in practice, to solve $x^k \equiv b \pmod{m}$? We’ll consider the three
steps in reverse order. Step 3 says to compute $b^u \pmod{m}$ by successive squaring,
so it causes no problem. Step 2 asks us to solve $ku - \phi(m)v = 1$. The method
described in Chapter 6 for solving such equations is also quite practical, even for
large values of $k$ and $\phi(m)$, since it is based on the Euclidean algorithm.

Finally, we come to the innocuous-looking Step 1, which says to find the value
of $\phi(m)$. If we know the factorization of $m$ into primes, then it is easy to
compute $\phi(m)$ using the formulas in Chapter 11. However, if $m$ is very large, it may
be extremely difficult, if not impossible, to factor $m$ in any reasonable amount of
time. For example, suppose that you are asked to solve the congruence

$$x^{3968039} \equiv 34781 \pmod{27040397}.$$

If you didn’t have a computer, it might take you quite a while to discover that
27040397 factors as a product of two primes, $27040397 = 4409 \cdot 6133$, so

$$\phi(27040397) = 4408 \cdot 6132 = 27029856.$$

Having computed $\phi(m)$, we can do Step 2,

$$3968039 \cdot 17881559 - 27029856 \cdot 2625050 = 1,$$
and then Step 3,

$$x \equiv 34781^{17881559} \equiv 22929826 \pmod{27040397},$$

to find the solution.

Now imagine that rather than choosing an $m$ with only 8 digits, I had instead
taken two primes $p$ and $q$, each of which has 100 digits, and set $m = pq$. Then it
would be virtually impossible for you to solve $x^k \equiv b \pmod{m}$ unless I were to
tell you the values of $p$ and $q$, since if you don’t know the values of $p$ and $q$, then
you won’t be able to find the value of $\phi(m)$.


In summary, this chapter contains an efficient and practical method to solve

$$x^k \equiv b \pmod{m}$$

provided that we are able to calculate $\phi(m)$. It may seem unfortunate that the
method does not work if we cannot calculate $\phi(m)$, but it is exactly this “weakness”
that is exploited in the next chapter to construct extremely secure codes.


Exercises 


17.1. Solve the congruence $x^{329} \equiv 452 \pmod{1147}$. [Hint. 1147 is not prime.]


17.2. (a) Solve the congruence $x^{113} \equiv 347 \pmod{463}$.
(b) Solve the congruence $x^{275} \equiv 139 \pmod{588}$.


17.3. In this chapter we described how to compute a $k^{\text{th}}$ root of $b$ modulo $m$, but you
may well have asked yourself if $b$ can have more than one $k^{\text{th}}$ root. Indeed it can! For
example, if $a$ is a square root of $b$ modulo $m$, then clearly $-a$ is also a square root of $b$
modulo $m$.

(a) Let $b$, $k$, and $m$ be integers that satisfy

$$\gcd(b, m) = 1 \quad\text{and}\quad \gcd(k, \phi(m)) = 1.$$

Show that $b$ has exactly one $k^{\text{th}}$ root modulo $m$.

(b) Suppose instead that $\gcd(k, \phi(m)) > 1$. Show that either $b$ has no $k^{\text{th}}$ roots
modulo $m$, or else it has at least two $k^{\text{th}}$ roots modulo $m$. (This is a hard problem with
the material that we have done up to this point.)

(c) If $m = p$ is prime, look at some examples and try to find a formula for the number of
$k^{\text{th}}$ roots of $b$ modulo $p$ (assuming that it has at least one).


17.4. Our method for solving $x^k \equiv b \pmod{m}$ is first to find integers $u$ and $v$ satisfying
$ku - \phi(m)v = 1$, and then the solution is $x \equiv b^u \pmod{m}$. However, we only showed
that this works provided that $\gcd(b, m) = 1$, since we used Euler’s formula
$b^{\phi(m)} \equiv 1 \pmod{m}$.
(a) If $m$ is a product of distinct primes, show that $x \equiv b^u \pmod{m}$ is always a solution
to $x^k \equiv b \pmod{m}$, even if $\gcd(b, m) > 1$.
(b) Show that our method does not work for the congruence $x^5 \equiv 6 \pmod{9}$.

17.5. (a) Try to use the methods in this chapter to compute the square root of 23
modulo 1279. (The number 1279 is prime.) What goes wrong?

(b) More generally, if $p$ is an odd prime, explain why the methods in this chapter cannot
be used to find square roots modulo $p$. We will investigate the problem of square
roots modulo $p$ in later chapters.

(c) Even more generally, explain why our method for computing $k^{\text{th}}$ roots modulo $m$
does not work if $\gcd(k, \phi(m)) > 1$.


17.6. Write a program to solve $x^k \equiv b \pmod{m}$. Give the user the option of
providing a factorization of $m$ to be used for computing $\phi(m)$.


Chapter 18 


Powers, Roots, 
and “Unbreakable” Codes 


In the last two chapters we learned how to compute powers and roots of extremely
large numbers modulo $m$. Briefly, we know how to compute $a^k \pmod{m}$ for any
values of $a$, $k$, and $m$, and we know how to solve $x^k \equiv b \pmod{m}$ provided that
we can calculate $\phi(m)$. Here’s the basic idea that we use to encode and decode
messages.¹

The first step in encoding a message is to convert it into a string of numbers.
We use the simplest possible method to do this. We set A = 11, B = 12, ...,
Z = 36. Here’s a convenient table to use:

| A | B | C | D | E | F | G | H | I | J | K | L | M |
| 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |

| N | O | P | Q | R | S | T | U | V | W | X | Y | Z |
| 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 |

For example, the message “To be or not to be” becomes

| T | O | B | E | O | R | N | O | T | T | O | B | E |
| 30 | 25 | 12 | 15 | 25 | 28 | 24 | 25 | 30 | 30 | 25 | 12 | 15 |


¹Technically, what we describe in this chapter is a cipher, not a code, so we are really enciphering
and deciphering messages. Historically, the word code was reserved for methods in which entire
words and phrases are replaced by a single symbol or number, while ciphers use individual letters as
their basic units. More recently, the word code has acquired other mathematical meanings in different
contexts. For ease of exposition, we use the terms code and cipher interchangeably.

So our message is the string of digits 30251215252824253030251215. Of course,
in some sense the message is now encoded, since this string of digits serves to
conceal the message. But even an amateur cryptographer would be able to break
this simple code in just a few minutes.²
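The letter-to-number conversion is easy to mechanize. The following Python sketch is an illustration rather than part of the original text; the function names `text_to_digits` and `digits_to_text` are hypothetical, and, as in the chapter, spaces and punctuation are simply dropped.

```python
def text_to_digits(message: str) -> str:
    """Map A..Z to 11..36 and drop every character that is not a letter A-Z."""
    letters = [ch for ch in message.upper() if "A" <= ch <= "Z"]
    return "".join(str(ord(ch) - ord("A") + 11) for ch in letters)


def digits_to_text(digits: str) -> str:
    """Inverse map: read the digit string two digits at a time."""
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return "".join(chr(int(pair) - 11 + ord("A")) for pair in pairs)


print(text_to_digits("To be or not to be"))  # 30251215252824253030251215
print(digits_to_text("30251215252824253030251215"))  # TOBEORNOTTOBE
```

Because every letter becomes exactly two digits, the digit string can always be read back unambiguously.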

Now we are ready to explain the crux of the encoding and decoding process.
The first thing that we do is choose two (large) prime numbers p and q. Next we
multiply them together to get a modulus m = pq. We can also compute
$\phi(m) = \phi(p)\phi(q) = (p - 1)(q - 1)$, and we choose a number k that is relatively prime to
$\phi(m)$. Now we publish the numbers m and k for the whole world to know, but we
keep the values of p and q secret. Anyone who wants to send us a message uses the
values of m and k to encode the material in the following manner.

First, they convert their message into a string of digits as described above. 
Next, they look at the number m and break their string of digits into numbers 
that are less than m. For example, if m is a number in the millions, they would 
write their message as a list of six-digit numbers. So now their message is a list 
of numbers $a_1, a_2, \ldots, a_r$. The next step is to use successive squaring to compute
$a_1^k \pmod{m}$, $a_2^k \pmod{m}$, ..., $a_r^k \pmod{m}$. These values form a new list
of numbers $b_1, b_2, \ldots, b_r$. This list is the encoded message. In other words, the
message that is sent to us is the list of numbers $b_1, b_2, \ldots, b_r$.

How do we decode the message when we receive it? We have been sent the
numbers $b_1, b_2, \ldots, b_r$, and we need to recover the numbers $a_1, a_2, \ldots, a_r$. Each
$b_i$ is congruent to $a_i^k \pmod{m}$, so to find $a_i$ we need to solve the congruence
$x^k \equiv b_i \pmod{m}$. This is exactly the problem we solved in the last chapter,
assuming we were able to calculate $\phi(m)$. But we know the values of p and q
with m = pq, so we easily compute

$\phi(m) = \phi(p)\phi(q) = (p - 1)(q - 1) = pq - p - q + 1 = m - p - q + 1$.


Now we just need to apply the method used in Chapter 17 to solve each of the
congruences $x^k \equiv b_i \pmod{m}$. The solutions are the numbers $a_1, a_2, \ldots, a_r$, and
then it is easy to take this string of digits and recover the original message.
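It may help to recall why the Chapter 17 method recovers $a_i$. The method finds integers u and v with $ku - \phi(m)v = 1$ and takes $x \equiv b_i^u \pmod{m}$. The following short calculation is a sketch that assumes $\gcd(a_i, m) = 1$, so that Euler’s formula $a_i^{\phi(m)} \equiv 1 \pmod{m}$ applies:

```latex
\begin{align*}
b_i^{\,u} &\equiv \bigl(a_i^{k}\bigr)^{u} = a_i^{ku} = a_i^{1 + \phi(m)v} \\
          &= a_i \cdot \bigl(a_i^{\phi(m)}\bigr)^{v}
           \equiv a_i \cdot 1^{v} \equiv a_i \pmod{m}.
\end{align*}
```

Exercise 18.2 asks what happens when $a_i$ and m have a common factor.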

We illustrate the encoding and decoding procedure with the primes p = 12553
and q = 13007. We multiply them together to get the modulus m = pq =
163276871, and we also record for future use


$\phi(m) = (p - 1)(q - 1) = 163251312$.


²We could have assigned a number, such as 99, to represent a space, and we could even have
assigned numbers to represent various punctuation marks. But, to keep things simple, we ignore
such niceties and just write our messages with all the letters squashed together.

We also need to choose a k that is relatively prime to $\phi(m)$, so we take k = 79921.
In summary, we have chosen


p = 12553, q = 13007, m = pq = 163276871, and k = 79921.


Now suppose we want to send the message “To be or not to be.” As described 
earlier, this message becomes the string of digits 


30251215252824253030251215. 
The number m is 9 digits long, so we break the message up into 8-digit numbers:
30251215, 25282425, 30302512, 15. 


Next we use the method of successive squares to raise each of these numbers to the
k-th power modulo m.


$30251215^{79921} \equiv 149419241 \pmod{163276871}$

$25282425^{79921} \equiv 62721998 \pmod{163276871}$

$30302512^{79921} \equiv 118084566 \pmod{163276871}$

$15^{79921} \equiv 40481382 \pmod{163276871}$


The encoded message is the list of numbers 
149419241, 62721998, 118084566, 40481382. 
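The encoding step can be sketched in a few lines of Python. This is an illustration under our own naming (`split_into_blocks` and `encode` are hypothetical helpers), and it relies on Python's built-in three-argument `pow`, which performs fast modular exponentiation in the spirit of successive squaring. If the sketch is faithful, the second line printed should agree with the list of encoded numbers given above.

```python
# Public key from the chapter's worked example.
M = 163276871   # modulus m = p*q
K = 79921       # encoding exponent k


def split_into_blocks(digits: str, block_len: int) -> list[int]:
    """Cut the digit string into pieces of block_len digits (the last may be shorter)."""
    return [int(digits[i:i + block_len]) for i in range(0, len(digits), block_len)]


def encode(digits: str, m: int, k: int) -> list[int]:
    """Raise every block to the k-th power modulo m."""
    block_len = len(str(m)) - 1   # one digit fewer than m, so each block is below m
    return [pow(a, k, m) for a in split_into_blocks(digits, block_len)]


message_digits = "30251215252824253030251215"
print(split_into_blocks(message_digits, 8))  # [30251215, 25282425, 30302512, 15]
print(encode(message_digits, M, K))
```

Since every letter code lies between 11 and 36, no block begins with the digit 0, so converting blocks to integers loses no information.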


Now let’s try decoding a new message. It’s after midnight, there’s a knock at 
your door, and a mysterious messenger delivers the following cryptic missive: 


145387828, 47164891, 152020614, 27279275, 35356191. 


Without a moment of hesitation, you whip out your handy-dandy number theory 
decoding book and start to work. One number at a time, you use the methods from 
Chapter 17 to solve the congruences 


$x^{79921} \equiv 145387828 \pmod{163276871} \implies x \equiv 30182523$
$x^{79921} \equiv 47164891 \pmod{163276871} \implies x \equiv 26292524$
$x^{79921} \equiv 152020614 \pmod{163276871} \implies x \equiv 19291924$
$x^{79921} \equiv 27279275 \pmod{163276871} \implies x \equiv 30282531$
$x^{79921} \equiv 35356191 \pmod{163276871} \implies x \equiv 122215$


This gives you the string of digits 


30182523262925241929192430282531122215, 




and now you use the number-to-letter substitution table for the final decoding step. 


30 18 25 23 26 29 25 24 19 29 19 24 30 28 25 31 12 22 15
 T  H  O  M  P  S  O  N  I  S  I  N  T  R  O  U  B  L  E


Supplying the obvious word breaks and punctuation, you read 
“Thompson is in trouble” 


and off you go to the rescue. 
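For readers who prefer a computer to a handy-dandy decoding book, here is a minimal Python sketch of the decoding step. The helper names `decode` and `digits_to_text` are hypothetical. The exponent u is the one from Chapter 17, with ku congruent to 1 modulo $\phi(m)$; we obtain it here from Python's `pow(k, -1, phi)` (available from Python 3.8), which is a shortcut of ours rather than the hand method described in the text.

```python
P, Q = 12553, 13007          # the secret primes from the chapter's example
M = P * Q                    # public modulus m
PHI = (P - 1) * (Q - 1)      # phi(m) = (p - 1)(q - 1)
K = 79921                    # public exponent k


def decode(cipher_blocks, k, m, phi):
    """Solve x^k = b (mod m) for each block b by computing b^u (mod m),
    where k*u is congruent to 1 modulo phi(m)."""
    u = pow(k, -1, phi)      # modular inverse of k modulo phi(m)
    return [pow(b, u, m) for b in cipher_blocks]


def digits_to_text(digits):
    """Read a digit string two digits at a time, mapping 11..36 to A..Z."""
    return "".join(chr(int(digits[i:i + 2]) - 11 + ord("A"))
                   for i in range(0, len(digits), 2))


missive = [145387828, 47164891, 152020614, 27279275, 35356191]
blocks = decode(missive, K, M, PHI)
print(blocks)
print(digits_to_text("".join(str(a) for a in blocks)))
```

If the numbers are transcribed correctly, the output should match the decoded blocks and the message found above.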

Is this encoding scheme secure? Suppose that you intercept a message that you 
know has been encoded with the modulus m and the exponent k. How difficult 
would it be for you to break the code and read the message? At present, the only 
way to decode is to find the value of ¢(m) and then use the decoding process just 
described. If m is the product of two primes p and q, then 


$\phi(m) = (p - 1)(q - 1) = pq - p - q + 1 = m - p - q + 1$.


Since you already know the value of m, you just need to find the value of p + q. 
But if you can find p + q, then you can also determine p and q, since they are the
roots of the quadratic equation


$X^2 - (p + q)X + m = 0$.


So in order to decode the intercepted message, you essentially need to find the 
factors p and q of m. 
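The equivalence between knowing $\phi(m)$ and knowing the factors can be made concrete. The sketch below (the function name `factor_from_phi` is our own) recovers p and q from m and $\phi(m)$ by applying the quadratic formula to the equation above, using the integer square root from Python's standard library.

```python
from math import isqrt


def factor_from_phi(m: int, phi: int) -> tuple[int, int]:
    """Recover p <= q from m = p*q and phi = (p - 1)(q - 1)."""
    s = m - phi + 1            # s = p + q, because phi = m - p - q + 1
    disc = s * s - 4 * m       # discriminant of X^2 - sX + m, equal to (q - p)^2
    root = isqrt(disc)
    if root * root != disc:
        raise ValueError("phi is not consistent with m being a product of two primes")
    return (s - root) // 2, (s + root) // 2


print(factor_from_phi(163276871, 163251312))  # (12553, 13007)
```

For the chapter's example, p + q = 25560 and the discriminant is 454², which gives the two primes back immediately.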

If m is not too large, say 5 or 10 digits, then a computer will find the factors 
almost immediately. Using more advanced methods from number theory, mathe- 
maticians have devised techniques that will factor much larger numbers, say those 
with 50 to 100 digits. So if you take primes p and q with fewer than 50 digits each,
your code will not be secure. However, if you take primes with, say, 100 dig- 
its each, then no one at present will be able to decode your messages unless you 
reveal to them your values of p and q. Of course, it is possible that future mathe- 
matical advances will enable people to factor 200-digit numbers; but then you need 
merely take primes p and q with 200 digits each, and your 400-digit modulus m
will again render your messages secure. The idea underlying the encoding scheme 
is thus a very simple one: It is easy to multiply large numbers together, but it is 
difficult to factor a large number. 

The cryptographic method described in this chapter is called a public key cryp- 
tosystem. This name reflects the fact that the encoding key consisting of the mod- 
ulus m and the exponent k can be distributed to the public while the decoding 
method remains secure. This idea, that it might be possible to have a code where
knowledge of the encoding process does not enable one to decode messages, was
propounded by Whitfield Diffie and Martin Hellman in 1976. Diffie and Hellman 
gave a theoretical description of how such a public key cryptosystem might work, 
and the following year Ron Rivest, Adi Shamir, and Leonard Adleman described 
a practical public key cryptosystem. Their idea, which we have described in this 
chapter, is called the RSA public key cryptosystem in honor of its three inventors. 


Exercises 


18.1. Decode the following message, which was sent using the modulus m = 7081 and 
the exponent k = 1789. (Note that you will first need to factor m.)


5192, 2604, 4222 


18.2. It may appear that RSA decryption does not work if you are unlucky enough to 
choose a message a that is not relatively prime to m. Of course, if m = pq and p and q are 
large, this is very unlikely to occur. 
(a) Show that in fact RSA decryption does work for all messages a, regardless of whether 
or not they have a factor in common with m. 
(b) More generally, show that RSA decryption works for all messages a as long as m is
a product of distinct primes.
(c) Give an example with m = 18 and a = 3 where RSA decryption does not work.
[Remember, k must be chosen relatively prime to $\phi(m) = 6$.]


18.3. Write a short report on one or more of the following topics. 
(a) The history of public key cryptography 
(b) The RSA public key cryptosystem 
(c) Public key digital signatures 
(d) The political and social consequences of the availability of inexpensive unbreakable 
codes and the government’s response 


18.4. Here are two longer messages to decode if you like to use computers.
(a) You have been sent the following message: 


5272281348, 21089283929, 3117723025, 26844144908, 22890519533, 
26945939925, 27395704341, 2253724391, 1481682985, 2163791130, 
13583590307, 5838404872, 12165330281, 28372578777, 7536755222. 


It has been encoded using p = 187963, q = 163841, m = pq = 30796045883, and 
k = 48611. Decode the message. 

(b) You intercept the following message, which you know has been encoded using the
modulus m = 956331992007843552652604425031376690367 and exponent k =
12398737. Break the code and decipher the message.


821566670681253393182493050080875560504, 

87074173129046399720949786958511391052, 
552100909946781566365272088688468880029, 
491078995197839451033115784866534122828, 
172219665767314444215921020847762293421. 


(The material for this exercise is available on the Friendly Introduction to Number 
Theory home page listed in the Preface.) 


18.5. Write a program to implement the RSA cryptosystem. Make your program as
user friendly as possible. In particular, the person encoding a message should be able to 
type in their message as words, including spaces and punctuation; similarly, the decoder 
should see the message appear as words with spaces and punctuation. 


18.6. The problem of factoring large numbers has been much studied in recent years be- 
cause of its importance in cryptography. Find out about one of the following factorization 
methods and write a short description of how it works. (Information on these methods is 
available in number theory textbooks and on the web.) 

(a) Pollard’s ρ method (that is the Greek letter rho)

(b) Pollard’s p − 1 method

(c) The quadratic sieve factorization method 

(d) Lenstra’s elliptic curve factorization method 

(e) The number field sieve 
(The last two methods require advanced ideas, so you will need to learn about elliptic 
curves or number fields before you can understand them.) The number field sieve is the 
most powerful factorization method currently known. It is capable of factoring numbers of 
more than 150 digits. 


18.7. Write a computer program implementing one of the factorization methods that
you studied in the previous exercise, such as Pollard’s ρ method, Pollard’s p − 1 method,
or the quadratic sieve. Use your program to factor the following numbers.

(a) 47386483629775753 

(b) 1834729514979351371768185745442640443774091 


Chapter 19 


Primality Testing 
and Carmichael Numbers 


Prime numbers are the fundamental building blocks of the integers. Within the 
infinitude of prime numbers we see displayed some of the deepest and most beau- 
tiful patterns in all of number theory, and indeed in all of mathematics. And prime 
numbers, especially large prime numbers, have their practical side as well, as we 
saw when we constructed the RSA cryptosystem in Chapter 18. This leads us 
inexorably to the following question: 


How can we tell if a (large) number is prime? 
For small numbers n such as
8629, 8633, and 8641, 


we can simply check all possible (prime) divisors up to $\sqrt{n}$, and either we find a
divisor or we know when we’re done that n is prime. Thus we find that 8629 and 
8641 are prime numbers, but 8633 factors as 89 - 97, so it is not prime. 
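Trial division is simple enough to write down directly. The following Python sketch (with a hypothetical helper name, `smallest_factor`) checks 2 and then the odd candidates up to $\sqrt{n}$; trying only prime divisors would be a little faster but needs a list of primes.

```python
from math import isqrt


def smallest_factor(n: int) -> int:
    """Return the smallest prime factor of n >= 2 (n itself when n is prime)."""
    if n % 2 == 0:
        return 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return d
    return n


for n in (8629, 8633, 8641):
    d = smallest_factor(n)
    print(n, "is prime" if d == n else f"= {d} * {n // d}")
# 8629 is prime
# 8633 = 89 * 97
# 8641 is prime
```

The loop runs about $\sqrt{n}/2$ times in the worst case, which is why this approach becomes hopeless for numbers with dozens of digits.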

For larger numbers such as 


m = 113736947625310405231177973028344375862964001 
and 
n = 113736947625310405231177973028344375862953603 


it is too much work, even with a computer, to try all possible divisors up to the
square root. However, we saw in Chapter 16 that it is not very difficult (on a
computer) to raise numbers to very high powers modulo very large numbers. For
example, it takes very little time for a computer to calculate


$2^m = 2^{113736947625310405231177973028344375862964001}$


$\equiv 39241970815393499060120043692630615961790020 \pmod{m}$.


At first glance, this seems like a completely useless calculation to make, but in fact 
it has tremendous practical significance. 

To explain why, we recall Fermat’s Little Theorem (Theorem 9.1), which says 
that if p is a prime number then¹


$a^p \equiv a \pmod{p}$ for every integer a.


Thus the fact that $2^m$ is not congruent to 2 modulo m tells us that m is definitely
not a prime number. We can state this unequivocally: The incongruence
$2^m \not\equiv 2 \pmod{m}$ constitutes a proof that m is not prime. It is worth reflecting for
a moment on the surprising strength of our conclusion. We have proved that m
is not prime, even though we do not know how to factor m; and indeed our proof
that m is composite provides no clues² to aid us in finding a factor! The lesson to
be learned is that it is often possible to establish that a number is composite without 
being able to factor it. 
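In code, this compositeness proof is a single modular exponentiation. The sketch below uses a hypothetical function name, `fermat_proves_composite`; a result of `True` is a proof that n is composite, while `False` proves nothing either way.

```python
def fermat_proves_composite(n: int, a: int = 2) -> bool:
    """Return True when a^n is not congruent to a modulo n.

    By Fermat's Little Theorem this can only happen if n is composite.
    """
    return pow(a, n, n) != a % n


# The 45-digit number m from the text.
M_BIG = 113736947625310405231177973028344375862964001
print(pow(2, M_BIG, M_BIG))            # the residue of 2^m modulo m
print(fermat_proves_composite(M_BIG))  # True would confirm that m is composite

print(fermat_proves_composite(15))     # True: 2^15 = 32768 leaves remainder 8 modulo 15
print(fermat_proves_composite(13))     # False: 13 is prime
```

Python's integers have unlimited precision, so no special multiprecision library is needed for numbers of this size.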
Now consider the other number 


n = 113736947625310405231177973028344375862953603. 


If we perform a similar calculation, we find that 


$2^n = 2^{113736947625310405231177973028344375862953603} \equiv 2 \pmod{n}$.


Can we use Fermat’s Little Theorem to conclude that n is prime? The answer is
absolutely not; Fermat’s Little Theorem doesn’t work in that direction. So we try a
few more numbers, say up to a = 100, and we find that


$3^n \equiv 3 \pmod{n}$, $4^n \equiv 4 \pmod{n}$, $5^n \equiv 5 \pmod{n}$, ...,
$100^n \equiv 100 \pmod{n}$.


¹Theorem 9.1 actually says that $a^{p-1} \equiv 1 \pmod{p}$ provided $p \nmid a$. We have multiplied this
version of Fermat’s Little Theorem by a in order to get a statement that is true for all values of a.
This is a more convenient form to use in this chapter.

²The number m is the product of the following two rather large prime numbers:

40103836670582470495139653 and 2836061511010998317. 




We still cannot use Fermat’s Little Theorem to conclude that n is prime, but the 
fact that $a^n \equiv a \pmod{n}$ for 99 different values of a certainly suggests that n is
“probably” prime. 

This is a rather odd assertion; how can a number be “probably prime”? Either 
it is a prime or else it isn’t a prime; it can’t be prime on Tuesdays and Thursdays 
and composite the rest of the week.³

Suppose that we think of the number n as a natural phenomenon and we study n 
in the spirit of an experimental scientist. We perform experiments by choosing 
different values for the number a and computing the value of 


$a^n \pmod{n}$.


If even a single experiment results in any number other than a, we conclude that n 
is definitely composite. So it is reasonable to believe that each time we perform an 
experiment and do obtain the value a we have gathered some “evidence” that n is 
prime. 

We can put this reasoning on a firm footing by looking at those values of a 
whose n-th power is different from a. We say that the number a is a witness for n if


$a^n \not\equiv a \pmod{n}$.


This is an excellent name for a since, if the number n is trying to impersonate a 
prime, the prosecuting attorney can put a on the witness stand to prove that n is 
actually composite. 

If n is prime, then it obviously has no witnesses. The table on page 132, in 
which we have listed the witnesses for all numbers n up to 20, suggests that com- 
posite numbers tend to have quite a few witnesses. 

To further bolster this observation, we selected some random composite num- 
bers between 100 and 1000 and counted how many witnesses they have. We also 
give the percentage of the numbers between 1 and n that serve as witnesses. 


n              |  287  |  190  |  314  |  586  |  935  |  808  |  728  |  291
# of witnesses |  278  |  150  |  310  |  582  |  908  |  804  |  720  |  282
% of witnesses | 96.9% | 78.9% | 98.7% | 99.3% | 97.1% | 99.5% | 98.9% | 96.9%


³“When I use a number,” Humpty Dumpty said in a rather scornful tone, “it means just what I
choose it to mean—neither more nor less.” 
“The question is,” said Alice, “whether you can make numbers mean different things.” 
“The question is,” said Humpty Dumpty, “which is to be master—that’s all.” 
Or, as Hamlet was wont to say, “I am but mad north-north-west: when the wind is southerly I 
know a prime from a composite.”

It seems that if n is composite, then most values of a serve as witnesses. For example,
if n = 287, and if we choose a random value for a, then there is a 96.9%
chance that a is a witness for the compositeness of n. Thus it will not take very
many experiments to prove that n is composite.


 n   Witnesses for n
 3   prime
 4   2, 3
 5   prime
 6   2, 5
 7   prime
 8   2, 3, 4, 5, 6, 7
 9   2, 3, 4, 5, 6, 7
10   2, 3, 4, 7, 8, 9
11   prime
12   2, 3, 5, 6, 7, 8, 10, 11
13   prime
14   2, 3, 4, 5, 6, 9, 10, 11, 12, 13
15   2, 3, 7, 8, 12, 13
16   2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
17   prime
18   2, 3, 4, 5, 6, 7, 8, 11, 12, 13, 14, 15, 16, 17
19   prime
20   2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19
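Witness lists like these are easy to generate by machine. The following Python sketch (the name `witnesses` is our own) tests every a from 1 to n and keeps those with $a^n \not\equiv a \pmod{n}$; it is a brute-force experiment, practical only for small n.

```python
def witnesses(n: int) -> list[int]:
    """All a with 1 <= a <= n for which a^n is not congruent to a modulo n."""
    return [a for a in range(1, n + 1) if pow(a, n, n) != a % n]


for n in range(3, 21):
    w = witnesses(n)
    print(n, w if w else "no witnesses")

count = len(witnesses(287))
print(count, f"{100 * count / 287:.1f}%")  # 278 96.9%
```

Primes show up as the numbers with no witnesses, and the last line reproduces the count quoted for n = 287.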

All our evidence and also common sense suggest that composite numbers have 
lots of witnesses. But is this really true? If we start to make a list of all numbers 
with their witnesses, we eventually run into the sad case of n = 561. This is a 
composite number, since 561 = 3-11-17, but unfortunately 561 doesn’t have 
even a single witness! One way to verify that 561 has no witnesses is to compute 
$a^n \pmod{n}$ for all 561 values of a. We take an easier approach. To prove that


$a^{561} \equiv a \pmod{561}$,

it is enough to prove that

$a^{561} \equiv a \pmod{3}$, $a^{561} \equiv a \pmod{11}$, and $a^{561} \equiv a \pmod{17}$,


since if a number is divisible by 3, by 11, and by 17, then it is divisible by their
product 3 · 11 · 17. For the first congruence, we observe that if 3 divides a then
both sides are 0, while if 3 does not divide a, we can use Fermat’s Little Theorem
$a^2 \equiv 1 \pmod{3}$ to compute


$a^{561} = a^{2 \cdot 280 + 1} = (a^2)^{280} \cdot a \equiv 1 \cdot a \equiv a \pmod{3}$.


The second and third congruences are checked in a similar fashion. Thus either
11 divides a and both sides are 0 modulo 11, or else we use the congruence
$a^{10} \equiv 1 \pmod{11}$ to compute


$a^{561} = a^{10 \cdot 56 + 1} = (a^{10})^{56} \cdot a \equiv 1 \cdot a \equiv a \pmod{11}$.


Finally, either 17 divides a and both sides are equal to 0 modulo 17, or else we use
$a^{16} \equiv 1 \pmod{17}$ to compute


$a^{561} = a^{16 \cdot 35 + 1} = (a^{16})^{35} \cdot a \equiv 1 \cdot a \equiv a \pmod{17}$.


Hence there are no witnesses for the composite number 561. 

This example and 14 others were first noted by R.D. Carmichael in 1910, so 
they are named in his honor. A Carmichael number is a composite number n with 
the property that 


$a^n \equiv a \pmod{n}$ for every integer $1 \le a \le n$.


In other words, a Carmichael number is a composite number that can masquerade 
as a prime, because there are no witnesses to its composite nature. We have seen 
that 561 is a Carmichael number, and in fact it is the smallest one. 

Here is the complete list of all Carmichael numbers up to 10000. 


561, 1105, 1729, 2465, 2821, 6601, 8911. 
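This list can be checked by brute force straight from the definition. The sketch below is an illustration with hypothetical function names; it tests every composite n below 10000 against all values of a. It is quick because almost every composite number already fails at a = 2 or a = 3.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Primality by trial division; adequate for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_carmichael_bruteforce(n: int) -> bool:
    """Composite n with a^n congruent to a modulo n for every a."""
    if n < 4 or is_prime(n):
        return False
    # a = 0 and a = 1 always satisfy the congruence, so start at 2.
    return all(pow(a, n, n) == a for a in range(2, n))


print([n for n in range(2, 10000) if is_carmichael_bruteforce(n)])
# [561, 1105, 1729, 2465, 2821, 6601, 8911]
```

A much faster test based on the prime factorization of n appears later in the chapter as Korselt’s Criterion.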


Factoring them,


 561 = 3 · 11 · 17       2821 = 7 · 13 · 31
1105 = 5 · 13 · 17       6601 = 7 · 23 · 41
1729 = 7 · 13 · 19       8911 = 7 · 19 · 67
2465 = 5 · 17 · 29


we immediately observe that each number in our list is the product of three distinct
odd primes. So we might make the conjecture that Carmichael numbers are always 
the product of three distinct odd primes. 

Our conjecture doesn’t fare too well, since 


62745 = 3 · 5 · 47 · 89


is a Carmichael number with four prime factors. This does not mean we should 
abandon our conjecture, merely that we must make some modifications. Notice 
that our conjecture was really three conjectures: that a Carmichael number has 
exactly three prime factors, that the prime factors are distinct, and that the prime 
factors are odd. So we drop the part that is false and state the other two parts 
separately: 


(A) Every Carmichael number is odd. 
(B) Every Carmichael number is a product of distinct primes. 


Let’s prove these two assertions. For (A), we use the Carmichael congruence 


$a^n \equiv a \pmod{n}$


with $a = n - 1 \equiv -1 \pmod{n}$ to get

$(-1)^n \equiv -1 \pmod{n}$.


This implies that n is odd (or n = 2). 
Next we prove (B). Suppose that n is a Carmichael number. Let p be a prime 
number dividing n, and let 


$p^{e+1}$ be the largest power of p dividing n.

We want to show that e is 0. The fact that n is a Carmichael number means that
$a^n \equiv a \pmod{n}$ for every value of a. In particular, this is true for $a = p^e$, so


$p^{en} \equiv p^e \pmod{n}$.


Thus n divides the difference $p^{en} - p^e$, and by assumption $p^{e+1}$ divides n, so we
conclude that

$p^{e+1}$ divides $p^{en} - p^e$.

Therefore,


$\dfrac{p^{en} - p^e}{p^{e+1}} = \dfrac{p^{en-e} - 1}{p}$ is an integer.

The only way that this can be true is if e = 0 (if e ≥ 1, then the numerator $p^{en-e} - 1$ is
congruent to −1 modulo p, so it is not divisible by p), which completes the proof of (B).

The two properties (A) and (B) of Carmichael numbers are useful, but it would 
be even more useful if we could devise a simple method for checking whether 
or not a number is a Carmichael number. Our earlier verification that 561 is a 
Carmichael number provides a clue. Rather than verifying that $a^n \equiv a \pmod{n}$,
we instead checked that $a^n \equiv a \pmod{p}$ for each prime p dividing n. Then the
congruence modulo p was compared with Fermat’s Little Theorem to give us a 
relationship between p and n. The upshot is a criterion for Carmichael numbers 
that we now formally state and prove. 


Theorem 19.1 (Korselt’s Criterion for Carmichael Numbers). Let n be a composite 
number. Then n is a Carmichael number if and only if it is odd and every prime p 
dividing n satisfies the following two conditions: 

(1) $p^2$ does not divide n.

(2) $p - 1$ divides $n - 1$.


Proof. Suppose first that n is a composite number, and further suppose that every 
prime divisor p of n satisfies conditions (1) and (2). We want to prove that n is 
a Carmichael number. Our proof uses the same arguments that we used to prove 
that 561 is a Carmichael number. 

We factor n as 


$n = p_1 p_2 p_3 \cdots p_r$


into a product of primes. From condition (1) we know that $p_1, p_2, \ldots, p_r$ are all
different. We also know from condition (2) that each $p_i - 1$ divides $n - 1$, so for
each i we can factor


$n - 1 = (p_i - 1)k_i$ for some integer $k_i$.


Now take any integer a. We compute the value of $a^n$ modulo $p_i$ as follows. First,
if $p_i$ divides a, then clearly


$a^n \equiv 0 \equiv a \pmod{p_i}$.

Otherwise $p_i$ does not divide a and we can use Fermat’s Little Theorem to compute

$$\begin{aligned}
a^n &= a^{(p_i - 1)k_i + 1} && \text{since } n - 1 = (p_i - 1)k_i, \\
    &= \left(a^{p_i - 1}\right)^{k_i} \cdot a \\
    &\equiv 1^{k_i} \cdot a \pmod{p_i} && \text{by Fermat’s Little Theorem, which tells us that } a^{p_i - 1} \equiv 1 \pmod{p_i}, \\
    &\equiv a \pmod{p_i}.
\end{aligned}$$

We have now proved that


$a^n \equiv a \pmod{p_i}$ for each i = 1, 2, ..., r.

In other words, $a^n - a$ is divisible by each of the primes $p_1, p_2, \ldots, p_r$, and hence
it is divisible by their product $n = p_1 p_2 \cdots p_r$. (Notice that this is where we are
using the fact that $p_1, p_2, \ldots, p_r$ are all different.) Therefore,


$a^n \equiv a \pmod{n}$,

and since we have shown that this is true for every integer a, we have completed
the proof that n is a Carmichael number.

This proves one half of Korselt’s Criterion: An odd composite number sat- 
isfying conditions (1) and (2) is a Carmichael number. For the other direction, 
we proved earlier that every Carmichael number satisfies condition (1), and in 
Exercise 19.1 we ask you to show that Carmichael numbers also satisfy condition (2).


To illustrate the power of Korselt’s Criterion, we verify that two of the exam- 
ples given previously are actually Carmichael numbers. First, Korselt’s Criterion 
tells us that 1729 = 7 · 13 · 19 is a Carmichael number, since


$\dfrac{1729 - 1}{7 - 1} = 288$, $\dfrac{1729 - 1}{13 - 1} = 144$, and $\dfrac{1729 - 1}{19 - 1} = 96$.

Second, 62745 = 3 · 5 · 47 · 89 is a Carmichael number, since

$\dfrac{62745 - 1}{3 - 1} = 31372$, $\dfrac{62745 - 1}{5 - 1} = 15686$, $\dfrac{62745 - 1}{47 - 1} = 1364$, and $\dfrac{62745 - 1}{89 - 1} = 713$.
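Korselt’s Criterion translates directly into a short program. The sketch below is one possible arrangement, with hypothetical function names; it factors n by trial division, which is fine for numbers of modest size but is not how one would treat very large n.

```python
def prime_factorization(n: int) -> dict[int, int]:
    """Return {prime: exponent} for n >= 2, found by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_carmichael_korselt(n: int) -> bool:
    """Korselt: n is odd, composite, squarefree, and p - 1 divides n - 1 for each prime p | n."""
    if n < 3 or n % 2 == 0:
        return False
    factors = prime_factorization(n)
    if sum(factors.values()) < 2:      # a single prime factor means n is prime
        return False
    return all(e == 1 and (n - 1) % (p - 1) == 0 for p, e in factors.items())


print(is_carmichael_korselt(1729))   # True
print(is_carmichael_korselt(62745))  # True
print(is_carmichael_korselt(1727))   # False: 1727 = 11 * 157 and 10 does not divide 1726
```

Compared with checking every a, this test needs only the factorization of n and one divisibility check per prime factor.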


In his 1910 paper, Carmichael conjectured that there are infinitely many Car- 
michael numbers. (Of course, he didn’t call them Carmichael numbers!) This 
conjecture remained unproved for more than 70 years. It was finally verified in 
1994 by W.R. Alford, A. Granville, and C. Pomerance. 

The fact that Carmichael numbers exist means that we need a better method for 
checking if a number is composite. The Rabin—Miller test for composite numbers 
is based on the following fact. 


Theorem 19.2 (A Property of Prime Numbers). Let p be an odd prime and write 


$p - 1 = 2^k q$ with q odd.

Let a be any number not divisible by p. Then one of the following two conditions
is true:
(i) $a^q$ is congruent to 1 modulo p.

(ii) One of the numbers $a^q, a^{2q}, a^{4q}, \ldots, a^{2^{k-1}q}$ is congruent to −1 modulo p.


Proof. Fermat’s Little Theorem tells us that $a^{p-1} \equiv 1 \pmod{p}$. This means that,
when we look at the list of numbers

$a^q, a^{2q}, a^{4q}, \ldots, a^{2^{k-1}q}, a^{2^k q}$,

we know that the last number in the list is congruent to 1 modulo p (since $2^k q$
equals p − 1). Furthermore, each number in the list is the square of the previous
number. Therefore, one of the following two possibilities must be true:


(i) The first number in the list is congruent to 1 modulo p. 


(ii) Some number in the list is not congruent to 1 modulo p, but when squared, it
becomes congruent to 1 modulo p. Since p is prime, the only number fitting this description
is −1 modulo p, so in this case the list contains −1 modulo p.


This completes the proof.


Turning the preceding property of prime numbers on its head, we obtain a test 
for composite numbers called the Rabin–Miller test. Thus, if n is an odd number
and if n does not have the aforementioned prime number property, then we know 
it must be a composite number. Furthermore, if n does have the prime number 
property for a lot of different values of a, then it is likely that n is prime. 


Theorem 19.3 (Rabin–Miller Test for Composite Numbers). Let n be an odd integer
and write $n - 1 = 2^k q$ with q odd. If both of the following conditions are true
for some a not divisible by n, then n is a composite number.

(a) $a^q \not\equiv 1 \pmod{n}$

(b) $a^{2^i q} \not\equiv -1 \pmod{n}$ for all i = 0, 1, ..., k − 1


We have already verified that the Rabin—Miller test works, since if n satis- 
fies (a) and (b), then it does not satisfy the prime number property described in 
Theorem 19.2, so it must be composite. Note that the Rabin—Miller test is very 
fast and easy to implement on a computer, since, after computing $a^q \pmod{n}$, we
simply compute a few squares modulo n. 
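To show how little work is involved, here is a minimal Python sketch of the test for a single value of a. The function names are our own, and the code assumes that n is odd, n ≥ 3, and a is not divisible by n. We try it on n = 561, the Carmichael number met earlier in the chapter.

```python
def rabin_miller_sequence(a: int, n: int) -> list[int]:
    """Return [a^q, a^(2q), ..., a^(2^(k-1) q)] modulo n, where n - 1 = 2^k * q with q odd."""
    q, k = n - 1, 0
    while q % 2 == 0:
        q //= 2
        k += 1
    x = pow(a, q, n)
    seq = [x]
    for _ in range(k - 1):
        x = x * x % n          # each entry is the square of the previous one
        seq.append(x)
    return seq


def is_rabin_miller_witness(a: int, n: int) -> bool:
    """True when conditions (a) and (b) of Theorem 19.3 both hold, proving n composite."""
    seq = rabin_miller_sequence(a, n)
    return seq[0] != 1 and all(x != n - 1 for x in seq)


print(rabin_miller_sequence(2, 561))    # [263, 166, 67, 1]
print(is_rabin_miller_witness(2, 561))  # True
print(is_rabin_miller_witness(2, 13))   # False, as it must be for a prime
```

After the initial exponentiation, the loop performs only k − 1 further squarings, which is why the test is so fast in practice.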

For any particular choice of a, the Rabin—Miller test either conclusively proves 
that n is composite or suggests that n might be prime. A Rabin—Miller witness for 
the compositeness of n is a number a for which the Rabin–Miller test successfully
proves that n is composite. The Rabin–Miller test is so useful because of
the following fact, which is proved in more advanced texts.


If n is an odd composite number, then at least 
75% of the numbers a between 1 and n — 1 act 
as Rabin—Miller witnesses for n. 


In other words, every composite number has lots of Rabin—Miller witnesses to 
its compositeness, so there aren’t any “Carmichael-type numbers” for the Rabin— 
Miller test. 

For example, if we randomly choose 100 different values for a, and if none of 
them are Rabin–Miller witnesses for n, then the probability⁴ of n being composite
is less than $0.25^{100}$, which is approximately $6 \cdot 10^{-61}$. And if you feel that this is
taking too much of a risk, you can always try another few hundred values for a. 
In practice, if n is composite, then just a few Rabin—Miller tests virtually always 
reveal this fact. 

To illustrate, we apply the Rabin–Miller test with a = 2 to the number n =
561, which you may recall is a Carmichael number. We have $n - 1 = 560 = 2^4 \cdot 35$,
so we compute


$2^{35} \equiv 263 \pmod{561}$,
$2^{2 \cdot 35} \equiv 263^2 \equiv 166 \pmod{561}$,
$2^{4 \cdot 35} \equiv 166^2 \equiv 67 \pmod{561}$,
$2^{8 \cdot 35} \equiv 67^2 \equiv 1 \pmod{561}$.


The first number $2^{35} \pmod{561}$ is neither 1 nor −1, and the other numbers are
not −1, so 2 is a Rabin–Miller witness to the fact that 561 is a composite number.
As a second example, we take the larger number n = 172947529. We have


$n - 1 = 172947528 = 2^3 \cdot 21618441$.

We apply the Rabin–Miller test with a = 17, and at the first step we get

$17^{21618441} \equiv 1 \pmod{172947529}$.

So 17 is not a Rabin–Miller witness for n. Next we try a = 3, but unfortunately


$3^{21618441} \equiv -1 \pmod{172947529}$,

so 3 also fails to be a Rabin–Miller witness. At this point we might suspect that n
is prime, but if we try another value, such as a = 23, we find


$23^{21618441} \equiv 40063806 \pmod{172947529}$,
$23^{2 \cdot 21618441} \equiv 2257065 \pmod{172947529}$,
$23^{4 \cdot 21618441} \equiv 1 \pmod{172947529}$,


so 23 is a Rabin–Miller witness that n is actually composite. In fact, n is a Carmichael
number, but it’s not so easy to factor (by hand).

⁴We have cheated a little bit. We really need to compute what is called a conditional probability,
in this case the probability that n is composite given that 100 values of a fail to be witnesses. The
correct bound for the probability is approximately $0.25^{100} \cdot \ln(n)$.


Exercises 


19.1. Let n be a Carmichael number and let p be a prime number that divides n.

(a) Finish the proof of Korselt’s Criterion by proving that p − 1 divides n − 1. [Hint.
We will prove in Chapter 28 that for every prime p there is a number g whose powers
$g, g^2, g^3, \ldots, g^{p-1}$ are all different modulo p. (Such a number is called a primitive
root.) Try putting a = g into the Carmichael congruence $a^n \equiv a \pmod{n}$.]

(b) Prove that p − 1 actually divides the smaller number $\frac{n}{p} - 1$.


19.2. Are there any Carmichael numbers that have only two prime factors? Either find an 
example or prove that none exists. 


19.3. Use Korselt’s Criterion to determine which of the following numbers are Carmichael 
numbers. 

(a) 1105 (b) 1235 (c) 2821 (d) 6601

(e) 8911 (f) 10659 (g) 19747 (h) 105545

(i) 126217 (j) 162401 (k) 172081 (l) 188461


19.4. Suppose that k is chosen so that the three numbers 
6k + 1, 12k + 1, 18k + 1


are all prime numbers. 
(a) Prove that their product n = (6k + 1)(12k + 1)(18k + 1) is a Carmichael number. 
(b) Find the first five values of k for which this method works and give the Carmichael 
numbers produced by the method. 


19.5. Find a Carmichael number that is the product of five primes. 


19.6. (a) Write a computer program that uses Korselt’s Criterion to check if a number
n is a Carmichael number.
(b) Earlier we listed all Carmichael numbers that are less than 10,000. Use your program 
to extend this list up to 100,000. 
(c) Use your program to find the smallest Carmichael number larger than 1,000,000.

19.7. (a) Let n = 1105, so $n - 1 = 2^4 \cdot 69$. Compute the values of
$2^{69} \pmod{1105}$, $2^{2 \cdot 69} \pmod{1105}$, $2^{4 \cdot 69} \pmod{1105}$, $2^{8 \cdot 69} \pmod{1105}$,


and use the Rabin—Miller test to conclude that n is composite. 

(b) Use the Rabin—Miller test with a = 2 to prove that n = 294409 is composite. Then 
find a factorization of n and show that it is a Carmichael number. 

(c) Repeat (b) with n = 118901521. 


19.8. Program the Rabin–Miller test with multiprecision integers and use it to investigate
which of the following numbers are composite.

(a) 155196355420821961 

(b) 155196355420821889 

(c) 285707540662569884530199015485750433489 

(d) 28570754066256988453019901548575 1094149 


Chapter 20 


Squares Modulo p 


We learned long ago in Chapter 8 how to solve linear congruences, 
$$ax \equiv c \pmod{m}.$$


It’s now time to take the plunge and move on to quadratic equations. We devote 
the next three chapters to answering the following types of questions: 


• Is 3 congruent to the square of some number modulo 7?

• Does the congruence $x^2 \equiv -1 \pmod{13}$ have a solution?

• For which primes p does the congruence $x^2 \equiv 2 \pmod{p}$ have a solution?

We can answer the first two questions right now. To see if 3 is congruent to the
square of some number modulo 7, we just square each of the numbers from 0 to 6,
reduce modulo 7, and see if any of them is equal to 3. Thus,


$0^2 \equiv 0 \pmod{7}$
$1^2 \equiv 1 \pmod{7}$
$2^2 \equiv 4 \pmod{7}$
$3^2 \equiv 2 \pmod{7}$
$4^2 \equiv 2 \pmod{7}$
$5^2 \equiv 4 \pmod{7}$
$6^2 \equiv 1 \pmod{7}$


So we see that 3 is not congruent to a square modulo 7. In a similar fashion,
if we square each number from 0 to 12 and reduce modulo 13, we find that the
congruence $x^2 \equiv -1 \pmod{13}$ has two solutions, $x \equiv 5 \pmod{13}$ and
$x \equiv 8 \pmod{13}$.¹
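
The check just described is easy to mechanize. The following Python lines are a minimal sketch of it; the function name `square_roots_mod` is our own choice for illustration and is not notation used elsewhere in this book. It simply squares every number from 0 to m − 1 and keeps the ones whose square is congruent to a.

```python
def square_roots_mod(a, m):
    """Return all x in 0..m-1 whose square is congruent to a modulo m (brute force)."""
    return [x for x in range(m) if (x * x - a) % m == 0]


print(square_roots_mod(3, 7))    # [] : 3 is not a square modulo 7
print(square_roots_mod(-1, 13))  # [5, 8]
```

This takes m squarings, so it is only sensible for small moduli; it is meant as a way to gather data, not as an efficient method.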

As always, we need to look at some data before we can even begin to look 
for patterns and make conjectures. Here are some tables giving all the squares 
modulo p for p = 5, 7, 11, and 13. 


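   Modulo 5      Modulo 7      Modulo 11     Modulo 13
   b   b^2       b   b^2       b   b^2       b   b^2
   0    0        0    0        0    0        0    0
   1    1        1    1        1    1        1    1
   2    4        2    4        2    4        2    4
   3    4        3    2        3    9        3    9
   4    1        4    2        4    5        4    3
                 5    4        5    3        5   12
                 6    1        6    3        6   10
                               7    5        7   10
                               8    9        8   12
                               9    4        9    3
                              10    1       10    9
                                            11    4
                                            12    1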

Many interesting patterns are already apparent from these lists. For example, each 
number (other than 0) that appears as a square seems to appear exactly twice. 
Thus, 5 is both $4^2$ and $7^2$ modulo 11, and 3 is both $4^2$ and $9^2$ modulo 13. In
fact, if we fold each list over in the middle, the same numbers appear as squares on 
the top and on the bottom. 

How can we describe this pattern with a formula? We are saying that the square 
of the number b and the square of the number p — b are the same modulo p. But 
now that we’ve described our pattern by a formula, it’s easy to prove. Thus, 


$$(p - b)^2 = p^2 - 2pb + b^2 \equiv b^2 \pmod{p}.$$


¹For many years during the nineteenth century, mathematicians were uneasy with the idea of the
number $\sqrt{-1}$. Its current appellation “imaginary number” still reflects that disquiet. But if you work
modulo 13, for example, then there’s nothing mysterious about $\sqrt{-1}$. In fact, 5 and 8 are both square
roots of $-1$ modulo 13.

So if we want to list all the (nonzero) numbers that are squares modulo p, we only
need to compute half of them:


$$1^2 \pmod{p}, \quad 2^2 \pmod{p}, \quad 3^2 \pmod{p}, \quad \ldots, \quad \left(\frac{p-1}{2}\right)^2 \pmod{p}.$$


Our goal is to find patterns that can be used to distinguish squares from nonsquares 
modulo p. Ultimately, we will be led to one of the most beautiful theorems in all 
number theory, the Law of Quadratic Reciprocity, but first we must perform the 
mundane task of assigning some names to the numbers we want to study. 


A nonzero number that is congruent to a square modulo p is called 
a quadratic residue modulo p. A number that is not congruent to a 
square modulo p is called a (quadratic) nonresidue modulo p. We 
abbreviate these long expressions by saying that a quadratic residue is 
a QR and a quadratic nonresidue is an NR. A number that is congruent 
to 0 modulo p is neither a residue nor a nonresidue. 


To illustrate this terminology using the data from our tables, 3 and 12 are QRs 
modulo 13, while 2 and 5 are NRs modulo 13. Note that 2 and 5 are NRs because 
they do not appear in the list of squares modulo 13. The full set of QRs modulo 13 
is {1, 3, 4, 9, 10, 12}, and the full set of NRs modulo 13 is {2, 5, 6, 7, 8, 11}. Similarly,
the set of QRs modulo 7 is {1, 2, 4} and the set of NRs modulo 7 is {3, 5, 6}.

Notice that there are six quadratic residues and six nonresidues modulo 13,
and there are three quadratic residues and three nonresidues modulo 7. Using our
earlier observation that $(p - b)^2 \equiv b^2 \pmod{p}$, we can easily verify that there are
an equal number of quadratic residues and nonresidues modulo any (odd) prime.
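
The counts quoted above can be reproduced with a few lines of Python. This is only a sketch for experimenting with small primes; the name `residues_and_nonresidues` is hypothetical, and the function assumes without checking that its argument is an odd prime. It uses the observation that only the squares of 1, 2, ..., (p − 1)/2 need to be computed.

```python
def residues_and_nonresidues(p):
    """Split 1..p-1 into quadratic residues and nonresidues modulo an odd prime p."""
    half = (p - 1) // 2
    # Squaring only 1..(p-1)/2 is enough, since (p - b)^2 and b^2 agree modulo p.
    qrs = sorted({(b * b) % p for b in range(1, half + 1)})
    nrs = [a for a in range(1, p) if a not in qrs]
    return qrs, nrs


for p in (7, 13):
    qrs, nrs = residues_and_nonresidues(p)
    print(p, qrs, nrs, len(qrs) == len(nrs))
```

Agreement for a handful of primes is evidence, not proof; the general statement still needs an argument that works for every odd prime.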


Theorem 20.1. Let p be an odd prime. Then there are exactly (p — 1)/2 quadratic 
residues modulo p and exactly (p — 1)/2 nonresidues modulo p. 


Proof. The quadratic residues are the nonzero numbers that are squares modulo p, 
so they are the numbers 


$$1^2, \ 2^2, \ \ldots, \ (p-1)^2 \pmod{p}.$$


But, as we noted above, we only need to go halfway,


$$1^2, \ 2^2, \ \ldots, \ \left(\frac{p-1}{2}\right)^2 \pmod{p},$$


since the same numbers are repeated in reverse order if we square the remaining
numbers


$$\left(\frac{p+1}{2}\right)^2, \ \ldots, \ (p-2)^2, \ (p-1)^2 \pmod{p}.$$

So in order to show that there are exactly $(p-1)/2$ quadratic residues, we need to
check that the numbers $1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2$ are all different modulo p.

Suppose that $b_1$ and $b_2$ are numbers between 1 and $(p-1)/2$, and suppose that
$b_1^2 \equiv b_2^2 \pmod{p}$. We want to show that $b_1 = b_2$. The fact that $b_1^2 \equiv b_2^2 \pmod{p}$
means that


$$p \text{ divides } b_1^2 - b_2^2 = (b_1 - b_2)(b_1 + b_2).$$


However, $b_1 + b_2$ is between 2 and $p - 1$, so it can’t be divisible by p. Thus p must
divide $b_1 - b_2$. But $|b_1 - b_2| < (p-1)/2$, so the only way for $b_1 - b_2$ to be divisible
by p is to have $b_1 = b_2$. This shows that the numbers $1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2$ are
all different modulo p, so there are exactly $(p-1)/2$ quadratic residues modulo p.
Now we need only observe that there are $p - 1$ numbers between 1 and $p - 1$, so
if half of them are quadratic residues, the other half must be nonresidues. □


Suppose that we take two quadratic residues and multiply them together. Do 
we get a QR or an NR, or do we sometimes get one and sometimes the other? 
For example, 3 and 10 are QRs modulo 13, and their product $3 \cdot 10 = 30 \equiv 4$
is again a QR modulo 13. Actually, this should have been clear without any
computation, since if we multiply two squares, we should get a square. We can
formally verify this in the following way. Suppose that $a_1$ and $a_2$ are both QRs
modulo p. This means that there are numbers $b_1$ and $b_2$ such that $a_1 \equiv b_1^2 \pmod{p}$
and $a_2 \equiv b_2^2 \pmod{p}$. Multiplying these two congruences together, we find that
$a_1 a_2 \equiv (b_1 b_2)^2 \pmod{p}$, which shows that $a_1 a_2$ is a QR.

The situation is less clear if we multiply a QR by an NR, or if we multiply two 
NRs together. Here are some examples using the data in our tables: 


   QR × NR ≡ ?? (mod p)              NR × NR ≡ ?? (mod p)

    2 × 5 ≡ 3 (mod 7)    NR           3 × 5  ≡ 1  (mod 7)    QR
    5 × 6 ≡ 8 (mod 11)   NR           6 × 7  ≡ 9  (mod 11)   QR
    4 × 5 ≡ 7 (mod 13)   NR           5 × 11 ≡ 3  (mod 13)   QR
   10 × 7 ≡ 5 (mod 13)   NR           7 × 11 ≡ 12 (mod 13)   QR


Thus, multiplying a quadratic residue and a nonresidue seems to yield a nonresidue, 
while the product of two nonresidues always seems to be a residue. Symbolically, 
we might write 


QR × QR = QR,    QR × NR = NR,    NR × NR = QR.


We’ve already seen that the first relation is true, and we now verify the other two
relations.

Theorem 20.2 (Quadratic Residue Multiplication Rule). (Version 1) Let p be an
odd prime. Then:

(i) The product of two quadratic residues modulo p is a quadratic residue.

(ii) The product of a quadratic residue and a nonresidue is a nonresidue.

(iii) The product of two nonresidues is a quadratic residue.

These three rules can be summarized symbolically by the formulas


QR × QR = QR,    QR × NR = NR,    NR × NR = QR.
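
Before reading the proof, it can be reassuring to test the three rules exhaustively for a few small primes. The following Python sketch does so; the helper names `qr_set` and `check_multiplication_rule` are ours, introduced only for this illustration, and the code assumes that p is an odd prime. It relies on the fact that the three rules together say that a product is a QR exactly when its two factors are of the same type.

```python
from itertools import product


def qr_set(p):
    """The set of quadratic residues modulo p, found by squaring 1..p-1."""
    return {(b * b) % p for b in range(1, p)}


def check_multiplication_rule(p):
    """Return True if QRxQR=QR, QRxNR=NR and NRxNR=QR all hold modulo p."""
    qrs = qr_set(p)
    for a1, a2 in product(range(1, p), repeat=2):
        same_type = (a1 in qrs) == (a2 in qrs)   # both QR or both NR
        if ((a1 * a2) % p in qrs) != same_type:
            return False
    return True


print([check_multiplication_rule(p) for p in (3, 5, 7, 11, 13)])
```

A finite check of this kind cannot replace the proof, but a single failure would have told us that the conjecture was wrong.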


Proof. We have already seen that QR × QR = QR. Suppose next that $a_1$ is a QR,
say $a_1 \equiv b_1^2 \pmod{p}$, and that $a_2$ is an NR. We are going to assume that $a_1 a_2$ is a
QR and derive a contradiction. The assumption that $a_1 a_2$ is a QR means that it is
congruent to $b_3^2$ for some $b_3$, so we have


$$b_3^2 \equiv a_1 a_2 \equiv b_1^2 a_2 \pmod{p}.$$


Note that $\gcd(b_1, p) = 1$, since $p \nmid a_1$ and $a_1 \equiv b_1^2$, so the Linear Congruence
Theorem (Theorem 8.1) says that we can find an inverse for $b_1$ modulo p. In other
words, we can find some $c_1$ such that $c_1 b_1 \equiv 1 \pmod{p}$. Multiplying both sides of
the above congruence by $c_1^2$ gives


$$c_1^2 b_3^2 \equiv c_1^2 b_1^2 a_2 \equiv (c_1 b_1)^2 a_2 \equiv a_2 \pmod{p}.$$


Thus $a_2 \equiv (c_1 b_3)^2 \pmod{p}$ is a QR, contradicting the fact that $a_2$ is an NR. This
completes the proof that
QR × NR = NR.


We are left to deal with the product of two NRs. Let a be an NR and consider
the set of values
$$a, \ 2a, \ 3a, \ \ldots, \ (p-2)a, \ (p-1)a \pmod{p}.$$


By an argument we’ve used before (see Lemma 9.2 on page 68), these are just the
numbers $1, 2, \ldots, (p-1)$ rearranged in some different order. In particular, they
include the $\frac{1}{2}(p-1)$ QRs and the $\frac{1}{2}(p-1)$ NRs. However, as we already proved,
each time that we multiply a by a QR, we get an NR, so the $\frac{1}{2}(p-1)$ products


a × QR


already give us all $\frac{1}{2}(p-1)$ NRs in the list. Hence when we multiply a by an
NR, the only possibility is that it is equal to one of the QRs in the list, because the
a × QR products have already used up all of the NRs in the list.² □


²“When you have eliminated all of the quadratic residues, the remaining numbers, no matter how
improbable, must be the nonresidues!” (with apologies to Sherlock Holmes and Sir Arthur Conan
Doyle).

This completes the proof of the quadratic residue multiplication rules. Now
take a minute to stare at


QR × QR = QR,    QR × NR = NR,    NR × NR = QR.


Do these rules remind you of anything? If not, here’s a hint. Suppose that we try
to replace the symbols QR and NR with numbers. What numbers would work?
That’s right, the symbol QR behaves like +1 and the symbol NR behaves like $-1$.
Notice that the somewhat mysterious third rule, the one that says that the product
of two nonresidues is a quadratic residue, reflects the equally mysterious rule³

$$(-1) \times (-1) = +1.$$

Having observed that QRs behave like +1 and NRs behave like $-1$, Adrien-Marie
Legendre introduced the following useful notation.


The Legendre symbol of a modulo p is
$$\left(\frac{a}{p}\right) = \begin{cases} 1 & \text{if $a$ is a quadratic residue modulo $p$,} \\ -1 & \text{if $a$ is a nonresidue modulo $p$.} \end{cases}$$


For example, data from our earlier tables says that 


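$$\left(\frac{3}{13}\right) = 1, \quad \left(\frac{11}{13}\right) = -1, \quad \left(\frac{2}{7}\right) = 1, \quad \left(\frac{3}{7}\right) = -1.$$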


Using the Legendre symbol, our quadratic residue multiplication rules can be given 
by a single formula. 


Theorem 20.3 (Quadratic Residue Multiplication Rule). (Version 2) Let p be an
odd prime. Then

$$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right).$$


The Legendre symbol is useful for making calculations. For example, suppose 
that we want to know if 75 is a square modulo 97. We can compute 


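$$\left(\frac{75}{97}\right) = \left(\frac{3 \cdot 5 \cdot 5}{97}\right) = \left(\frac{3}{97}\right)\left(\frac{5}{97}\right)\left(\frac{5}{97}\right) = \left(\frac{3}{97}\right).$$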


³You may no longer consider the formula $(-1) \times (-1) = +1$ mysterious, since it’s so familiar
to you. But you should have found it mysterious the first time you saw it. And if you stop to think
about it, there is no obvious reason why the product of two negative numbers should equal a positive
number. Can you come up with a convincing argument that $(-1) \times (-1)$ must equal +1?

Notice that it doesn’t matter whether $\left(\frac{5}{97}\right)$ is +1 or $-1$, since it appears twice, and
$(+1)^2 = (-1)^2 = 1$. Now we observe that $10^2 \equiv 3 \pmod{97}$, so 3 is a QR.
Hence,

$$\left(\frac{75}{97}\right) = \left(\frac{3}{97}\right) = 1.$$


Of course, we were lucky in being able to recognize 3 as a QR modulo 97. Is there
some way to evaluate a Legendre symbol like $\left(\frac{3}{97}\right)$ without relying on luck or trial
and error? The answer is yes, but that’s a topic for another chapter. 
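
In the meantime, trial and error can at least be delegated to a machine. The sketch below computes a Legendre symbol by listing the squares modulo p, exactly as in our tables; the function name `legendre` is our own, and the code assumes that p is an odd prime. We then use it to confirm the calculation for 75 modulo 97.

```python
def legendre(a, p):
    """Legendre symbol of a modulo an odd prime p, found by listing the squares."""
    a %= p
    if a == 0:
        raise ValueError("a is divisible by p, so it is neither a QR nor an NR")
    squares = {(b * b) % p for b in range(1, (p - 1) // 2 + 1)}
    return 1 if a in squares else -1


p = 97
print(legendre(75, p))                          # symbol of 75 computed directly
print(legendre(3, p) * legendre(5, p) ** 2)     # same value via the multiplication rule
```

Both printed values agree, as the multiplication rule says they must. The method costs about p/2 squarings, which is why a better tool is needed for large primes.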


Exercises 


20.1. Make a list of all the quadratic residues and all the nonresidues modulo 19. 
20.2. For each odd prime p, we consider the two numbers


A = sum of all $1 \le a < p$ such that a is a quadratic residue modulo p,


B = sum of all $1 \le a < p$ such that a is a nonresidue modulo p.

For example, if p = 11, then the quadratic residues are


$1^2 \equiv 1 \pmod{11}$, $2^2 \equiv 4 \pmod{11}$, $3^2 \equiv 9 \pmod{11}$,
$4^2 \equiv 5 \pmod{11}$, $5^2 \equiv 3 \pmod{11}$,


so
A = 1 + 4 + 9 + 5 + 3 = 22 and B = 2 + 6 + 7 + 8 + 10 = 33.


(a) Make a list of A and B for all odd primes p < 20. 

(b) What is the value of A + B? Prove that your guess is correct. 

(c) Compute A mod p and B mod p. Find a pattern and prove that it is correct. [Hint.
See Exercise 7.4 for a formula for $1^2 + 2^2 + \cdots + n^2$ that might be useful.]

(d) Compile some more data and give a criterion on p which ensures that A = B. After 
reading Chapter 21, you will be asked to prove your criterion. 


(e) Write a computer program to compute A and B, and use it to make a table for
all odd p < 100. If $A \ne B$, which one tends to be larger, A or B? Try to prove that
your guess is correct, but be forewarned that this is a very difficult problem. 


20.3. A number a is called a cubic residue modulo p if it is congruent to a cube modulo p,
that is, if there is a number b such that $a \equiv b^3 \pmod{p}$.
(a) Make a list of all the cubic residues modulo 5, modulo 7, modulo 11, and modulo 13.
(b) Find two numbers $a_1$ and $b_1$ such that neither $a_1$ nor $b_1$ is a cubic residue modulo 19,
but $a_1 b_1$ is a cubic residue modulo 19. Similarly, find two numbers $a_2$ and $b_2$ such
that none of the three numbers $a_2$, $b_2$, or $a_2 b_2$ is a cubic residue modulo 19.
(c) If $p \equiv 2 \pmod{3}$, make a conjecture as to which a’s are cubic residues. Prove that
your conjecture is correct.


Chapter 21 


Is −1 a Square Modulo p?
Is 2? 


In the previous chapter we took various primes p and looked at the a’s that were 
quadratic residues and the a’s that were nonresidues. For example, we made a table 
of squares modulo 13 and used the table to see that 3 and 12 are QRs modulo 13, 
while 2 and 5 are NRs modulo 13. 

In keeping with all of the best traditions of mathematics, we now turn this 
problem on its head. Rather than taking a particular prime p and listing the a’s that 
are QRs and NRs, we instead fix an a and ask for which primes p the number a is a QR. To
make it clear exactly what we’re asking, we start with the particular value $a = -1$.
The question that we want to answer is as follows:


For which primes p is $-1$ a QR?


We can rephrase this question in other ways, such as “For which primes p does
the congruence $x^2 \equiv -1 \pmod{p}$ have a solution?” and “For which primes p is
$\left(\frac{-1}{p}\right) = 1$?”

As always, we need some data before we can make any hypotheses. We can
answer our question for small primes in the usual mindless way by making a table
of $1^2, 2^2, 3^2, \ldots \pmod{p}$ and checking if any of the numbers are congruent to $-1$
modulo p. So, for example, $-1$ is not a square modulo 3, since $1^2 \not\equiv -1 \pmod{3}$
and $2^2 \not\equiv -1 \pmod{3}$, while $-1$ is a square modulo 5, since $2^2 \equiv -1 \pmod{5}$.
Here’s a more extensive list.


   p                               3    5      7    11   13     17      19   23   29       31
   Solutions to x² ≡ −1 (mod p)    NR   2, 3   NR   NR   5, 8   4, 13   NR   NR   12, 17   NR

Reading from this table, we compile the following data:


$-1$ is a quadratic residue for p = 5, 13, 17, 29.
$-1$ is a nonresidue for p = 3, 7, 11, 19, 23, 31.


It’s not hard to discern the pattern. If p is congruent to 1 modulo 4, then $-1$
seems to be a quadratic residue modulo p, and if p is congruent to 3 modulo 4,
then $-1$ seems to be a nonresidue. We can express this guess using Legendre
symbols,

$$\left(\frac{-1}{p}\right) \overset{?}{=} \begin{cases} 1 & \text{if } p \equiv 1 \pmod{4}, \\ -1 & \text{if } p \equiv 3 \pmod{4}. \end{cases}$$

Let’s check our conjecture on the next few cases. The next two primes, 37 and 41,
are both congruent to 1 modulo 4 and, sure enough,


$x^2 \equiv -1 \pmod{37}$ has the solutions $x \equiv 6$ and $31 \pmod{37}$, and
$x^2 \equiv -1 \pmod{41}$ has the solutions $x \equiv 9$ and $32 \pmod{41}$.


Similarly, the next two primes 43 and 47 are congruent to 3 modulo 4, and we 
check that —1 is a nonresidue for 43 and 47. Our guess is looking good! 
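
If you would like to test the guess on more primes than patience allows by hand, the following Python sketch repeats the same mindless search. The helper names `is_prime` and `roots_of_minus_one` are ours, chosen for this illustration, and both use plain trial methods that are adequate only for small numbers.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def roots_of_minus_one(p):
    """All x between 1 and p-1 whose square is congruent to -1 modulo p."""
    return [x for x in range(1, p) if (x * x + 1) % p == 0]


for p in range(3, 50):
    if is_prime(p):
        roots = roots_of_minus_one(p)
        print(p, p % 4, roots if roots else "NR")
```

Each output line shows a prime, its remainder modulo 4, and either the solutions or the label NR, so the conjectured pattern can be read off directly.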

The tool that we use to verify our conjecture might be called the “Square Root 
of Fermat’s Little Theorem.” How, you may well ask, does one take the square root 
of a theorem? Recall that Fermat’s Little Theorem (Chapter 9) says

$$a^{p-1} \equiv 1 \pmod{p}.$$

We won’t really be taking the square root of this theorem, of course. Instead, we
take the square root of the quantity $a^{p-1}$ and ask for its value. So we want to
answer the following question:


Let $A = a^{(p-1)/2}$. What is
the value of A modulo p?


One thing is obvious. If we square A, then Fermat’s Little Theorem tells us that
$$A^2 = a^{p-1} \equiv 1 \pmod{p}.$$


Hence, p divides $A^2 - 1 = (A - 1)(A + 1)$, so either p divides $A - 1$ or p divides
$A + 1$. (Notice how we are using Lemma 7.1, which is the property of prime
numbers that we proved on page 46.) Thus A must be congruent to either +1
or $-1$ modulo p.

Here are a few random values of p, a, and A. For comparison purposes, we
have also included the value of the Legendre symbol $\left(\frac{a}{p}\right)$. Do you see a pattern?


It certainly appears that $A \equiv 1 \pmod{p}$ when a is a quadratic residue and that
$A \equiv -1 \pmod{p}$ when a is a nonresidue. In other words, it looks like $A \pmod{p}$
has the same value as the Legendre symbol $\left(\frac{a}{p}\right)$. We use a counting argument to ver-
ify this assertion, which goes by the name of Euler’s Criterion. [For an alternative 
proof of this important result, see Exercise 28.8(c).] 


Theorem 21.1 (Euler’s Criterion). Let p be an odd prime. Then


$$a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod{p}.$$


Proof. Suppose first that a is a quadratic residue, say $a \equiv b^2 \pmod{p}$. Then Fermat’s
Little Theorem (Theorem 9.1) tells us that


$$a^{(p-1)/2} \equiv (b^2)^{(p-1)/2} = b^{p-1} \equiv 1 \pmod{p}.$$


Hence $a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod{p}$, which is Euler’s Criterion when a is a quadratic
residue.

We next consider the congruence


$$X^{(p-1)/2} - 1 \equiv 0 \pmod{p}.$$


We have just proven that every quadratic residue is a solution to this congruence,
and we know from Theorem 20.1 that there are exactly $\frac{1}{2}(p-1)$ distinct quadratic
residues. We also know from the Polynomial Roots Mod p Theorem (Theorem 8.2
on page 60) that this polynomial congruence can have at most $\frac{1}{2}(p-1)$ distinct
solutions. Hence


$$\{\text{solutions to } X^{(p-1)/2} - 1 \equiv 0 \pmod{p}\} = \{\text{quadratic residues modulo } p\}.$$


Now let a be a nonresidue. Fermat’s Little Theorem tells us that $a^{p-1} \equiv 1 \pmod{p}$, so


$$0 \equiv a^{p-1} - 1 \equiv \left(a^{(p-1)/2} - 1\right)\left(a^{(p-1)/2} + 1\right) \pmod{p}.$$

The first factor is not zero modulo p, because we already showed that the solutions
to $X^{(p-1)/2} - 1 \equiv 0 \pmod{p}$ are the quadratic residues. Hence the second factor
must vanish modulo p, so


$$a^{(p-1)/2} \equiv -1 = \left(\frac{a}{p}\right) \pmod{p}.$$


This shows that Euler’s Criterion is also true for nonresidues. □
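
Euler’s Criterion is also pleasant to use on a computer, because a power modulo p can be found quickly by successive squaring. The sketch below uses Python’s built-in three-argument `pow` for that purpose; the function names `euler_criterion` and `is_qr_by_listing` are our own, and the code assumes that p is an odd prime.

```python
def euler_criterion(a, p):
    """Return +1 or -1 according to a^((p-1)/2) modulo the odd prime p."""
    if a % p == 0:
        raise ValueError("a must not be divisible by p")
    value = pow(a, (p - 1) // 2, p)   # built-in modular exponentiation
    return 1 if value == 1 else -1    # the only other possible value is p - 1


def is_qr_by_listing(a, p):
    """Slow comparison method: look for b with b^2 congruent to a."""
    return any((b * b - a) % p == 0 for b in range(1, p))


p = 13
print([(a, euler_criterion(a, p)) for a in range(1, p)])
```

For p = 13 the values +1 appear exactly at a = 1, 3, 4, 9, 10, 12, matching the list of QRs found in Chapter 20.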


Using Euler’s Criterion, it is very easy to determine if $-1$ is a quadratic residue
modulo p. For example, if we want to know whether $-1$ is a square modulo the
prime p = 6911, we just need to compute


$$(-1)^{(6911-1)/2} = (-1)^{3455} = -1.$$


Euler’s Criterion then tells us that


$$\left(\frac{-1}{6911}\right) \equiv -1 \pmod{6911}.$$

But $\left(\frac{a}{p}\right)$ is always either +1 or $-1$, so in this case we must have $\left(\frac{-1}{6911}\right) = -1$.
Hence, $-1$ is a nonresidue modulo 6911.

Similarly, for the prime p = 7817 we find that


$$(-1)^{(7817-1)/2} = (-1)^{3908} = 1.$$


Hence, $\left(\frac{-1}{7817}\right) = 1$, so $-1$ is a quadratic residue modulo 7817. Observe that,
although we now know that the congruence


$$x^2 \equiv -1 \pmod{7817}$$


has a solution, we still don’t have any efficient way to find a solution. The solutions
turn out to be $x \equiv 2564 \pmod{7817}$ and $x \equiv 5253 \pmod{7817}$.
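
The contrast between deciding and finding shows up clearly in code. In the following sketch (the function names are ours, and both functions assume p is an odd prime), deciding whether $-1$ is a QR takes a single parity check, while finding the square roots falls back on an exhaustive search.

```python
def minus_one_is_qr(p):
    """Euler's Criterion for a = -1: only the parity of (p-1)/2 matters."""
    return ((p - 1) // 2) % 2 == 0


def roots_of_minus_one(p):
    """Brute-force search for x with x^2 congruent to -1; slow for large p."""
    return [x for x in range(1, p) if (x * x + 1) % p == 0]


print(minus_one_is_qr(6911), minus_one_is_qr(7817))  # False True
print(roots_of_minus_one(7817))                       # [2564, 5253]
```

The search is still quick for a four-digit prime, but its running time grows in proportion to p, whereas the parity check does not depend on the size of p at all.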

As these two examples make clear, Euler’s Criterion can be used to determine
exactly which primes have $-1$ as a quadratic residue. This elegant result, which
answers the initial question in the title of this chapter, is the first part of the Law of
Quadratic Reciprocity.


Theorem 21.2 (Quadratic Reciprocity). (Part I) Let p be an odd prime. Then 


$-1$ is a quadratic residue modulo p if $p \equiv 1 \pmod{4}$, and


$-1$ is a nonresidue modulo p if $p \equiv 3 \pmod{4}$.


In other words, using the Legendre symbol,


$$\left(\frac{-1}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \pmod{4}, \\ -1 & \text{if } p \equiv 3 \pmod{4}. \end{cases}$$

Proof. Euler’s Criterion says that


$$(-1)^{(p-1)/2} \equiv \left(\frac{-1}{p}\right) \pmod{p}.$$


Suppose first that $p \equiv 1 \pmod{4}$, say p = 4k + 1. Then

$$(-1)^{(p-1)/2} = (-1)^{2k} = 1, \quad \text{so} \quad 1 \equiv \left(\frac{-1}{p}\right) \pmod{p}.$$


But $\left(\frac{-1}{p}\right)$ is either +1 or $-1$, so it must equal 1. This proves that if $p \equiv 1 \pmod{4}$
then $\left(\frac{-1}{p}\right) = 1$.

Next we suppose that $p \equiv 3 \pmod{4}$, say p = 4k + 3. Then


$$(-1)^{(p-1)/2} = (-1)^{2k+1} = -1, \quad \text{so} \quad -1 \equiv \left(\frac{-1}{p}\right) \pmod{p}.$$


This shows that $\left(\frac{-1}{p}\right)$ must equal $-1$, which completes the proof of Quadratic
Reciprocity (Part I). □


We can use the first part of Quadratic Reciprocity to answer a question left 
over from Chapter 12. As you may recall, we showed that there are infinitely many 
primes that are congruent to 3 modulo 4, but we left unanswered the analogous 
question for primes congruent to 1 modulo 4. 


Theorem 21.3 (Primes 1 (Mod 4) Theorem). There are infinitely many primes that 
are congruent to 1 modulo 4. 


Proof. Suppose we are given a list of primes $p_1, p_2, \ldots, p_r$, all of which are congruent
to 1 modulo 4. We are going to find a new prime, not in our list, that is
congruent to 1 modulo 4. Repeating this process gives a list of any desired length.
Consider the number


$$A = (2 p_1 p_2 \cdots p_r)^2 + 1.$$

We know that A can be factored into a product of primes, say

$$A = q_1 q_2 \cdots q_s.$$


It is clear that $q_1, q_2, \ldots, q_s$ are not in our original list, since none of the $p_i$’s divide
A. So all we need to do is show that at least one of the $q_i$’s is congruent to 1
modulo 4. In fact, we’ll see that all of them are.

First we note that A is odd, so all the $q_i$’s are odd. Next, each $q_i$ divides A, so

$$(2 p_1 p_2 \cdots p_r)^2 + 1 = A \equiv 0 \pmod{q_i}.$$

This means that $x = 2 p_1 p_2 \cdots p_r$ is a solution to the congruence

$$x^2 \equiv -1 \pmod{q_i},$$


so $-1$ is a quadratic residue modulo $q_i$. Now Quadratic Reciprocity tells us that
$q_i \equiv 1 \pmod{4}$. □


We can use the procedure described in this proof to produce a list of primes 
that are congruent to 1 modulo 4. Thus, if we start with $p_1 = 5$, then we form
$A = (2p_1)^2 + 1 = 101$, so our second prime is $p_2 = 101$. Then


$$A = (2 p_1 p_2)^2 + 1 = 1020101,$$

which is again prime, so our third prime is $p_3 = 1020101$. We’ll go one more step,


$$A = (2 p_1 p_2 p_3)^2 + 1 = 1061522231810040101 = 53 \cdot 1613 \cdot 12417062216309.$$


Notice that all the primes 53, 1613, and 12417062216309 are congruent to 1 mod- 
ulo 4, just as predicted by the theory. 
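
The arithmetic in this example is easy to reproduce, since Python works with integers of any size. The sketch below (the name `next_A` is ours) recomputes the three values of A and checks the factorization quoted above; it does not itself factor A or test primality, which would require additional tools.

```python
from math import prod


def next_A(primes):
    """A = (2 * p1 * p2 * ... * pr)^2 + 1 for the current list of primes."""
    return (2 * prod(primes)) ** 2 + 1


primes = [5]
A1 = next_A(primes)                      # 101
primes.append(A1)
A2 = next_A(primes)                      # 1020101
primes.append(A2)
A3 = next_A(primes)                      # 1061522231810040101

factors = [53, 1613, 12417062216309]     # factorization quoted in the text
print(A1, A2, A3)
print(prod(factors) == A3, [q % 4 for q in factors])
```

The last line confirms that the three quoted factors multiply to A and that each of them leaves remainder 1 when divided by 4.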

Having successfully answered the first question in the title of this chapter, we 
move on to the second question and consider a = 2, the “oddest” of all primes. 
Just as we did with a = —1, we are looking for some simple characterization 
for the primes p such that 2 is a quadratic residue modulo p. Can you find the 
pattern in the following data, where the line labeled $x^2 \equiv 2$ gives the solutions to
$x^2 \equiv 2 \pmod{p}$ if 2 is a quadratic residue modulo p and is marked NR if 2 is a
nonresidue?


   p          3      5      7      11     13     17      19     23      29     31
   x² ≡ 2     NR     NR     3, 4   NR     NR     6, 11   NR     5, 18   NR     8, 23

   p          37     41      43     47      53     59     61     67     71       73
   x² ≡ 2     NR     17, 24  NR     7, 40   NR     NR     NR     NR     12, 59   32, 41

   p          79      83     89       97       101    103      107    109    113      127
   x² ≡ 2     9, 70   NR     25, 64   14, 83   NR     38, 65   NR     NR     51, 62   16, 111

Here’s the list of primes separated according to whether 2 is a residue or a nonresidue.


2 is a quadratic residue for p = 7, 17, 23, 31, 41, 47, 71, 73,
79, 89, 97, 103, 113, 127
2 is a nonresidue for p = 3, 5, 11, 13, 19, 29, 37, 43, 53, 59,
61, 67, 83, 101, 107, 109


For $a = -1$, it turned out that the congruence class of p modulo 4 was crucial.
Is there a similar pattern if we reduce these two lists of primes modulo 4? Here’s 
what happens if we do. 


7, 17, 23, 31, 41, 47, 71, 73, 79, 89, 97, 103, 113, 127
$\equiv 3, 1, 3, 3, 1, 3, 3, 1, 3, 1, 1, 3, 1, 3 \pmod{4}$,

3, 5, 11, 13, 19, 29, 37, 43, 53, 59, 61, 67, 83, 101, 107, 109
$\equiv 3, 1, 3, 1, 3, 1, 1, 3, 1, 3, 1, 3, 3, 1, 3, 1 \pmod{4}$.


This doesn’t look too promising. Maybe we should try reducing modulo 3. 


7, 17, 23, 31, 41, 47, 71, 73, 79, 89, 97, 103, 113, 127
$\equiv 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 1, 1, 2, 1 \pmod{3}$,

3, 5, 11, 13, 19, 29, 37, 43, 53, 59, 61, 67, 83, 101, 107, 109
$\equiv 0, 2, 2, 1, 1, 2, 1, 1, 2, 2, 1, 1, 2, 2, 2, 1 \pmod{3}$.


This doesn’t look any better. Let’s make one more attempt before we give up. What 
happens if we reduce modulo 8? 


7, 17, 23, 31, 41, 47, 71, 73, 79, 89, 97, 103, 113, 127
$\equiv 7, 1, 7, 7, 1, 7, 7, 1, 7, 1, 1, 7, 1, 7 \pmod{8}$,

3, 5, 11, 13, 19, 29, 37, 43, 53, 59, 61, 67, 83, 101, 107, 109
$\equiv 3, 5, 3, 5, 3, 5, 5, 3, 5, 3, 5, 3, 3, 5, 3, 5 \pmod{8}$.


Eureka! It surely can’t be a coincidence that the first line is all 1’s and 7’s and the 
second line is all 3’s and 5’s. This suggests the general rule that 2 is a quadratic 
residue modulo p if p is congruent to 1 or 7 modulo 8 and that 2 is a nonresidue if p 
is congruent to 3 or 5 modulo 8. In terms of Legendre symbols, we would write 


$$\left(\frac{2}{p}\right) \overset{?}{=} \begin{cases} 1 & \text{if } p \equiv 1 \text{ or } 7 \pmod{8}, \\ -1 & \text{if } p \equiv 3 \text{ or } 5 \pmod{8}. \end{cases}$$

Can we use Euler’s Criterion to verify our guess? Unfortunately, the answer
is no, or at least not in any obvious way, since there doesn’t seem to be an easy
method to calculate $2^{(p-1)/2} \pmod{p}$. However, if you go back and examine our
proof of Fermat’s Little Theorem in Chapter 9, you’ll see that we took the numbers
$1, 2, \ldots, p - 1$, multiplied each one by a, and then multiplied them all together.
This gave us a factor of $a^{p-1}$ to pull out. In order to use Euler’s Criterion, we only
want $\frac{1}{2}(p-1)$ factors of a to pull out, so rather than starting with all of the numbers
from 1 to $p - 1$, we just take the numbers from 1 to $\frac{1}{2}(p-1)$. We illustrate this idea,
which is due to Gauss, to determine if 2 is a quadratic residue modulo 13.

We begin with half the numbers from 1 to 12: 1,2,3,4,5,6. If we multiply 
each by 2 and then multiply them together, we get 


$$2 \cdot 4 \cdot 6 \cdot 8 \cdot 10 \cdot 12 = (2 \cdot 1)(2 \cdot 2)(2 \cdot 3)(2 \cdot 4)(2 \cdot 5)(2 \cdot 6) = 2^6 \cdot 1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 = 2^6 \cdot 6!.$$

Notice the factor of $2^6 = 2^{(13-1)/2}$, which is the number we’re really interested in.
Gauss’s idea is to take the numbers 2, 4, 6, 8, 10, 12 and reduce each of them
modulo 13 to get a number lying between $-6$ and 6. The first three stay the same,
but we need to subtract 13 from the last three to get them into this range. Thus,


$2 \equiv 2 \pmod{13}$      $4 \equiv 4 \pmod{13}$       $6 \equiv 6 \pmod{13}$
$8 \equiv -5 \pmod{13}$     $10 \equiv -3 \pmod{13}$     $12 \equiv -1 \pmod{13}$.


Multiplying these numbers together, we find that

$$2 \cdot 4 \cdot 6 \cdot 8 \cdot 10 \cdot 12 \equiv 2 \cdot 4 \cdot 6 \cdot (-5) \cdot (-3) \cdot (-1) \equiv (-1)^3 \cdot 2 \cdot 4 \cdot 6 \cdot 5 \cdot 3 \cdot 1 \equiv -6! \pmod{13}.$$


Equating these two values of $2 \cdot 4 \cdot 6 \cdot 8 \cdot 10 \cdot 12 \pmod{13}$, we see that

$$2^6 \cdot 6! \equiv -6! \pmod{13}.$$


This implies that $2^6 \equiv -1 \pmod{13}$, so Euler’s Criterion tells us that 2 is a non-
residue modulo 13. 

Let’s briefly use the same ideas to check if 2 is a quadratic residue modulo 17. 
We take the numbers from 1 to 8, multiply each by 2, multiply them together, and 
calculate the product in two different ways. The first way gives 


$$2 \cdot 4 \cdot 6 \cdot 8 \cdot 10 \cdot 12 \cdot 14 \cdot 16 = 2^8 \cdot 8!.$$

For the second way, we reduce modulo 17 to bring the numbers into the range
from $-8$ to 8. Thus,


$2 \equiv 2 \pmod{17}$       $4 \equiv 4 \pmod{17}$       $6 \equiv 6 \pmod{17}$
$8 \equiv 8 \pmod{17}$       $10 \equiv -7 \pmod{17}$     $12 \equiv -5 \pmod{17}$
$14 \equiv -3 \pmod{17}$     $16 \equiv -1 \pmod{17}$.


Multiplying these together gives


$$2 \cdot 4 \cdot 6 \cdot 8 \cdot 10 \cdot 12 \cdot 14 \cdot 16 \equiv 2 \cdot 4 \cdot 6 \cdot 8 \cdot (-7) \cdot (-5) \cdot (-3) \cdot (-1) \equiv (-1)^4 \cdot 8! \pmod{17}.$$


Therefore, $2^8 \cdot 8! \equiv (-1)^4 \cdot 8! \pmod{17}$, so $2^8 \equiv 1 \pmod{17}$, and hence 2 is a
quadratic residue modulo 17. 
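
The two computations just carried out by hand can be repeated for any odd prime. The following Python sketch is one way to do it; the function name `gauss_two_ways` is ours, and the code assumes that p is an odd prime. It reduces the even numbers into the range from −P to P, counts the minus signs, and compares the two evaluations of the product modulo p.

```python
from math import factorial


def gauss_two_ways(p):
    """Compare the two evaluations of 2*4*...*(p-1) modulo an odd prime p."""
    P = (p - 1) // 2
    evens = [2 * j for j in range(1, P + 1)]
    reduced = [e if e <= P else e - p for e in evens]     # move into the range -P..P
    minus_signs = sum(1 for r in reduced if r < 0)
    first_way = (pow(2, P, p) * factorial(P)) % p         # 2^P * P!
    second_way = ((-1) ** minus_signs * factorial(P)) % p
    return reduced, minus_signs, first_way == second_way


print(gauss_two_ways(13))  # ([2, 4, 6, -5, -3, -1], 3, True)
print(gauss_two_ways(17))  # ([2, 4, 6, 8, -7, -5, -3, -1], 4, True)
```

An odd number of minus signs (as for 13) corresponds to 2 being a nonresidue, and an even number (as for 17) to 2 being a quadratic residue.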

Now let’s think about Gauss’s method a little more generally. Let p be any odd 
prime. To make our formulas simpler, we let 


$$P = \frac{p-1}{2}.$$

We start with the even numbers $2, 4, 6, \ldots, p - 1$. Multiplying them together and
factoring out a 2 from each number gives


$$2 \cdot 4 \cdot 6 \cdots (p-1) = 2^{(p-1)/2} \cdot 1 \cdot 2 \cdot 3 \cdots \frac{p-1}{2} = 2^P \cdot P!.$$


The next step is to take the list $2, 4, 6, \ldots, p - 1$ and reduce each number modulo
p so that it lies in the range from $-P$ to $P$, that is, between $-(p-1)/2$ and
$(p-1)/2$. The first few numbers won’t change, but at some point in the list we’ll
start hitting numbers that are larger than $(p-1)/2$, and each of these large numbers
needs to have p subtracted from it. Notice that the number of minus signs
introduced is exactly the number of times we need to subtract p. In other words,


   Number of minus signs  =  Number of integers in the list 2, 4, 6, ..., (p − 1)
                             that are larger than (p − 1)/2.


The following illustration may help to explain this procedure.


   2 · 4 · 6 · 8 · 10 · 12 ···     |     ··· (p − 5) · (p − 3) · (p − 1)

   Numbers ≤ (p − 1)/2                   Numbers > (p − 1)/2.
   are left unchanged.                   Need to subtract p from each.


Comparing the two products, we get


$$2^{(p-1)/2} \cdot P! = 2 \cdot 4 \cdot 6 \cdots (p-1) \equiv (-1)^{(\text{Number of minus signs})} \cdot P! \pmod{p},$$

so canceling P! from each side (which is allowed because p does not divide P!) gives the
fundamental formula


$$2^{(p-1)/2} \equiv (-1)^{(\text{Number of minus signs})} \pmod{p}.$$


Using this formula, it is easy to verify our earlier guess, thereby answering the 
second question in the chapter title. 
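
Before turning to the proof, here is a short Python sketch that uses the fundamental formula numerically. The names `minus_signs`, `two_is_qr`, and `is_prime` are ours, and the primality test is simple trial division. For each odd prime below 128 it counts the minus signs and prints whether the formula says that 2 is a QR or an NR, alongside the remainder of p modulo 8.

```python
def is_prime(n):
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def minus_signs(p):
    """Count the even numbers 2, 4, ..., p-1 that are larger than (p-1)/2."""
    half = (p - 1) // 2
    return sum(1 for e in range(2, p, 2) if e > half)


def two_is_qr(p):
    """By the fundamental formula, 2 is a QR exactly when the count is even."""
    return minus_signs(p) % 2 == 0


for p in range(3, 128):
    if is_prime(p):
        print(p, p % 8, "QR" if two_is_qr(p) else "NR")
```

The output can be compared line by line with the lists of primes compiled earlier in the chapter.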


Theorem 21.4 (Quadratic Reciprocity). (Part II) Let p be an odd prime. Then 2 
is a quadratic residue modulo p if p is congruent to 1 or 7 modulo 8, and 2 is a 
nonresidue modulo p if p is congruent to 3 or 5 modulo 8. In terms of the Legendre 
symbol, 
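$$\left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \text{ or } 7 \pmod{8}, \\ -1 & \text{if } p \equiv 3 \text{ or } 5 \pmod{8}. \end{cases}$$
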
Proof. There are actually four cases to consider, depending on the value of p mod- 
ulo 8. We do two of them and leave the other two for you to do. 

We start with the case that $p \equiv 3 \pmod{8}$, say p = 8k + 3. We need to list the
numbers $2, 4, \ldots, p - 1$ and determine how many of them are larger than $\frac{1}{2}(p-1)$.
In this case, $p - 1 = 8k + 2$ and $\frac{1}{2}(p-1) = 4k + 1$, so the cutoff is as indicated
in the following diagram:


$$2 \cdot 4 \cdot 6 \cdots 4k \ \Big| \ (4k+2) \cdot (4k+4) \cdots (8k+2).$$


We need to count how many numbers there are to the right of the vertical bar. In
other words, how many even numbers are there between 4k + 2 and 8k + 2? The
answer is 2k + 1. (If this isn’t clear to you, try a few values for k and you’ll see
why it’s correct.) This shows that there are 2k + 1 minus signs, so the fundamental
formula given above tells us that


$$2^{(p-1)/2} \equiv (-1)^{2k+1} \equiv -1 \pmod{p}.$$


Now Euler’s Criterion says that 2 is a nonresidue, so we have proved that 2 is a
nonresidue for any prime p that is congruent to 3 modulo 8.

Next let’s look at the primes that are congruent to 7 modulo 8, say p = 8k + 7.
Now the even numbers $2, 4, \ldots, p - 1$ are the numbers from 2 to 8k + 6, and the
midpoint is $\frac{1}{2}(p-1) = 4k + 3$. The cutoff in this case is


$$2 \cdot 4 \cdot 6 \cdots (4k+2) \ \Big| \ (4k+4) \cdot (4k+6) \cdots (8k+6).$$

There are exactly 2k + 2 numbers to the right of the vertical bar, so we get 2k + 2
minus signs. This yields

$$2^{(p-1)/2} \equiv (-1)^{2k+2} \equiv 1 \pmod{p},$$


so Euler’s Criterion tells us that 2 is a quadratic residue. This proves that 2 is a
quadratic residue for any prime p that is congruent to 7 modulo 8. □

Exercises


21.1. Determine whether each of the following congruences has a solution. (All of the
moduli are primes.)

(a) $x^2 \equiv -1 \pmod{5987}$          (c) $x^2 + 14x - 35 \equiv 0 \pmod{337}$

(b) $x^2 \equiv 6780 \pmod{6781}$        (d) $x^2 - 64x + 943 \equiv 0 \pmod{3011}$
[Hint. For (c), use the quadratic formula to find out what number you need to take the
square root of modulo 337, and similarly for (d).]


21.2. Use the procedure described in the Primes 1 (Mod 4) Theorem to generate a list of 
primes congruent to 1 modulo 4, starting with the seed $p_1 = 17$.


21.3. Here is a list of the first few primes for which 3 is a quadratic residue and a non- 
residue. 


Quadratic Residue: p = 11, 13, 23, 37, 47, 59, 61, 71, 73, 83, 97, 107, 109 
Nonresidue: p= 5,7,17,19, 29, 31, 41, 43, 53, 67, 79, 89, 101, 103, 113, 127 


Try reducing this list modulo m for various m’s until you find a pattern, and make a con- 
jecture explaining which primes have 3 as a quadratic residue. 


21.4. Finish the proof of Quadratic Reciprocity (Part II) for the other two cases: primes 
congruent to 1 modulo 8 and primes congruent to 5 modulo 8. 


21.5. Use the same ideas we used to verify Quadratic Reciprocity (Part II) to verify the 
following two assertions. 

(a) If p is congruent to 1 modulo 5, then 5 is a quadratic residue modulo p. 

(b) If p is congruent to 2 modulo 5, then 5 is a nonresidue modulo p. 
[Hint. Reduce the numbers $5, 10, 15, \ldots, \frac{5}{2}(p-1)$ so that they lie in the range from
$-\frac{1}{2}(p-1)$ to $\frac{1}{2}(p-1)$ and check how many of them are negative.]


21.6. In Exercise 20.2 we defined A and B to be the sums of the residues, respectively 
nonresidues, modulo p. Part (d) of that exercise asked you to find a condition on p which 
implies that A = B. Using the material in this section, prove that your criterion is correct. 
[Hint. The important fact you’ll need is the condition for $-1$ to be a quadratic residue.]


Chapter 22 


Quadratic Reciprocity 


Our current quest is to determine, for a given number a, exactly which primes p 
have a as a quadratic residue. In the previous chapter we solved this problem for 
$a = -1$ and $a = 2$. In both cases we found that we could determine whether a is
a quadratic residue modulo p by looking at p modulo m for some small m, more
specifically for m = 4 or m = 8.

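Now we want to tackle the question of the Legendre symbol $\left(\frac{a}{p}\right)$ for other values
of a. For example, suppose we want to compute $\left(\frac{70}{p}\right)$. We can use the Quadratic
Residue Multiplication Rules (Chapter 20) to compute

$$\left(\frac{70}{p}\right) = \left(\frac{2 \cdot 5 \cdot 7}{p}\right) = \left(\frac{2}{p}\right)\left(\frac{5}{p}\right)\left(\frac{7}{p}\right).$$

We already know how to find $\left(\frac{2}{p}\right)$, so we’re left with the problem of determining $\left(\frac{5}{p}\right)$
and $\left(\frac{7}{p}\right)$.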

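In general, if we want to compute $\left(\frac{a}{p}\right)$ for any number a, we can start by factoring
a into a product of primes, say

$$a = q_1 q_2 \cdots q_r.$$

(It’s okay if some of the $q_i$’s are the same.) Then the Quadratic Residue Multiplication
Rules give

$$\left(\frac{a}{p}\right) = \left(\frac{q_1}{p}\right)\left(\frac{q_2}{p}\right) \cdots \left(\frac{q_r}{p}\right).$$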


The moral of this story: If we know how to compute $\left(\frac{q}{p}\right)$ for primes q, then we
know how to compute $\left(\frac{a}{p}\right)$ for every a.¹ Since nothing we have done so far tells us

¹Yet another instance of the principle that primes are the basic building blocks of number theory,
so if you can solve a problem for primes, you’re usually well on your way to solving it for all
numbers.




anything about $\left(\frac{q}{p}\right)$ (for fixed q and varying p), the time has come² to compile some
data and use it to make some conjectures. The following table gives the value of


the Legendre symbol $\left(\frac{p}{q}\right)$ for all odd primes $p, q \le 37$.


The Value of the Legendre Symbol $\left(\frac{p}{q}\right)$
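
The entries of a table like this one can be recomputed from Euler's Criterion (Theorem 21.1), which says that $a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod p$. The following Python sketch is only an illustration and is not part of the original text. The helper names `legendre` and `legendre_table` are made up here, and the layout (rows indexed by p, columns by q, entry $\left(\frac{p}{q}\right)$) is an assumption about how the table is arranged.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, computed with Euler's Criterion."""
    r = pow(a % p, (p - 1) // 2, p)   # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


PRIMES = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]


def legendre_table(primes=PRIMES):
    """Map p -> {q: (p/q)} for distinct odd primes p and q (assumed layout)."""
    return {p: {q: legendre(p, q) for q in primes if q != p} for p in primes}


if __name__ == "__main__":
    table = legendre_table()
    print("p\\q " + "".join(f"{q:4d}" for q in PRIMES))
    for p in PRIMES:
        # Diagonal entries are left blank.
        cells = "".join(f"{table[p].get(q, ''):>4}" for q in PRIMES)
        print(f"{p:3d} " + cells)
```

Comparing `table[p][q]` with `table[q][p]` is the same as reflecting the table across its diagonal.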

Before reading further, you should take some time to study this table and try to 
find some patterns. Don’t worry if you don’t immediately discover the answer; the 
most important pattern concealed in this table is somewhat subtle. But you will 
find that it is well worth the effort to uncover the design on your own, since you 
then share the thrill of discovery with Legendre and Gauss. 

Now that you’ve formulated your own conjectures, we’ll examine the table 
together. We are going to compare the rows with the columns or, what amounts 
to the same thing, we are going to compare the entries when we reflect across the 
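diagonal of the table. For example, the row with p = 5 reads

    q                             3    7   11   13   17   19   23   29   31   37
    $\left(\frac{5}{q}\right)$   -1   -1    1   -1   -1    1   -1    1    1   -1


²“The time has come,” the Walrus said, “to talk of many things, of shoes, and primes, and
residues, and cabbages and kings.”

Similarly, the column with q = 5 (turned sideways to save space) is

    p                             3    7   11   13   17   19   23   29   31   37
    $\left(\frac{p}{5}\right)$   -1   -1    1   -1   -1    1   -1    1    1   -1

The row and the column match exactly, so we might guess that

$$\left(\frac{5}{p}\right) = \left(\frac{p}{5}\right)$$

for all primes p. Do you see how useful a rule like this would be? We are looking
for a method to calculate the Legendre symbol $\left(\frac{5}{p}\right)$, a difficult problem, but the
Legendre symbol $\left(\frac{p}{5}\right)$ is easy to compute, because it only depends on p modulo 5.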
In other words, we know that 


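$$\left(\frac{p}{5}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \text{ or } 4 \pmod 5, \\ -1 & \text{if } p \equiv 2 \text{ or } 3 \pmod 5. \end{cases}$$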
So if our guess that $\left(\frac{5}{p}\right) = \left(\frac{p}{5}\right)$ is correct, then we would know, for example, that 5
is a nonresidue modulo 3593, since

$$\left(\frac{5}{3593}\right) = \left(\frac{3593}{5}\right) = \left(\frac{3}{5}\right) = -1.$$

Similarly,

$$\left(\frac{5}{3889}\right) = \left(\frac{3889}{5}\right) = \left(\frac{4}{5}\right) = 1,$$

so 5 should be a quadratic residue modulo 3889, and sure enough we find that
$5 \equiv 2901^2 \pmod{3889}$.
Emboldened by this success, we might guess that

$$\left(\frac{q}{p}\right) = \left(\frac{p}{q}\right)$$

for all primes p and q. Unfortunately, this isn’t even true for the first row and
column of the table. For example,

$$\left(\frac{3}{7}\right) = -1 \qquad\text{and}\qquad \left(\frac{7}{3}\right) = 1.$$

So sometimes $\left(\frac{q}{p}\right)$ is equal to $\left(\frac{p}{q}\right)$, and sometimes it is equal to $-\left(\frac{p}{q}\right)$. The following
table will help us find a rule explaining when they are the same and when they are
opposites.


Table with ○ if $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ and ★ if $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$

Looking at this table, we can pick out the primes that have ○-filled rows and
columns:
p = 5, 13, 17, 29, 37.


The primes whose rows and columns are not exactly the same (i.e., the rows and
columns containing ★’s) are


p = 3, 7, 11, 19, 23, 31.


With our previous experience, there is no mystery about these lists; the former 
consists of the primes that are congruent to 1 modulo 4, and the latter contains the 
primes that are congruent to 3 modulo 4. 

So our first conjecture might be that if $p \equiv 1 \pmod 4$ or if $q \equiv 1 \pmod 4$
then the rows and columns are the same. We can write this in terms of Legendre
symbols.


Conjecture: If $p \equiv 1 \pmod 4$ or $q \equiv 1 \pmod 4$, then $\left(\frac{q}{p}\right) = \left(\frac{p}{q}\right)$.

What happens if both p and q are congruent to 3 modulo 4? Looking at the
table, we find in every instance that $\left(\frac{q}{p}\right)$ and $\left(\frac{p}{q}\right)$ are opposites. So we are led to
make a further guess.

Conjecture: If $p \equiv 3 \pmod 4$ and $q \equiv 3 \pmod 4$, then $\left(\frac{q}{p}\right) = -\left(\frac{p}{q}\right)$.


These two conjectural relations form the heart of the Law of Quadratic Reciprocity. 


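Theorem 22.1 (Law of Quadratic Reciprocity). Let p and q be distinct odd primes.

$$\left(\frac{-1}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \pmod 4, \\ -1 & \text{if } p \equiv 3 \pmod 4. \end{cases}$$

$$\left(\frac{2}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \text{ or } 7 \pmod 8, \\ -1 & \text{if } p \equiv 3 \text{ or } 5 \pmod 8. \end{cases}$$

$$\left(\frac{q}{p}\right) = \begin{cases} \left(\frac{p}{q}\right) & \text{if } p \equiv 1 \pmod 4 \text{ or } q \equiv 1 \pmod 4, \\ -\left(\frac{p}{q}\right) & \text{if } p \equiv 3 \pmod 4 \text{ and } q \equiv 3 \pmod 4. \end{cases}$$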

We have proven the Law of Quadratic Reciprocity for $\left(\frac{-1}{p}\right)$ and $\left(\frac{2}{p}\right)$. There are
many different proofs of the relationship between $\left(\frac{q}{p}\right)$ and $\left(\frac{p}{q}\right)$, but none of them
is easy. We will give a proof, due to Eisenstein, in the next chapter. Euler and
Legendre were the first to formulate the Law of Quadratic Reciprocity, but it remained
for Gauss to give the first proof in his famous monograph Disquisitiones
arithmeticae in 1801. Gauss discovered the law for himself when he was 19, and 
during his lifetime he found seven different proofs! Mathematicians during the 
nineteenth century subsequently formulated and proved Cubic and Quartic Reci- 
procity Laws, and these in turn were subsumed into the Class Field Theory devel- 
oped by David Hilbert, Emil Artin, and others from the 1890s through the 1920s 
and 1930s. During the 1960s and 1970s a number of mathematicians formulated a 
series of conjectures that vastly generalize Class Field Theory and that today go by 
the name of the Langlands Program. The fundamental theorem proved by Andrew 
Wiles in 1995 is a small piece of the Langlands Program, yet it sufficed to solve 
Fermat’s 350-year-old “Last Theorem.” 


Carl Friedrich Gauss (1777-1855) Carl Friedrich Gauss was one of the 
greatest mathematicians of all time, and arguably the finest number theorist 
to have ever lived. As a child, he was a mathematical prodigy whose feats 
impressed his family, friends, and teachers, and his mathematical talents only 
grew as he matured. His most influential work in number theory was pub- 
lished in 1801 under the title of Disquisitiones arithmeticae. It contains, 
among other things, the theory of quadratic reciprocity and the representation 
of numbers by binary forms. Much of the material in Gauss’s Disquisitiones 
was far ahead of its time and, as such, furnished paths for number theorists
to follow during the subsequent century and a half. In addition to his work 
in number theory, Gauss made fundamental contributions to many other ar- 
eas of mathematics, including geometry and differential equations. He also 
made many discoveries in physics and astronomy, including a method for 
computing orbits that he used to compute the position of the newly discov- 
ered asteroid Ceres in 1801. He published major papers in areas as diverse 
as crystallography, optics, and the physics of fluids, and he invented an elec- 
tromagnetic telegraph with Wilhelm Weber in 1833. He published 155 titles 
during his lifetime, but his life’s work was so prodigious that his Collected 
Works appeared during the period 1863 to 1933. 


The Law of Quadratic Reciprocity is not only a beautiful and subtle theoreti- 
cal statement about numbers, it is also a practical tool for determining whether a 
number is a quadratic residue. Essentially, it lets us flip the Legendre symbol $\left(\frac{q}{p}\right)$
and replace it by $\pm\left(\frac{p}{q}\right)$. Then we can reduce p modulo q and repeat the process.
This leads to Legendre symbols with smaller and smaller entries, so eventually we 
arrive at Legendre symbols that we can compute. Here’s an example with detailed 
justification for each step. 


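$$\begin{aligned}
\left(\frac{14}{137}\right) &= \left(\frac{2}{137}\right)\left(\frac{7}{137}\right) && \text{Quadratic Residue Multiplication Rule,} \\
&= \left(\frac{7}{137}\right) && \text{Quadratic Reciprocity says } \left(\tfrac{2}{137}\right) = 1 \text{ since } 137 \equiv 1 \pmod 8, \\
&= \left(\frac{137}{7}\right) && \text{Quadratic Reciprocity and } 137 \equiv 1 \pmod 4, \\
&= \left(\frac{4}{7}\right) && \text{reducing 137 modulo 7,} \\
&= 1 && \text{since } 4 = 2^2 \text{ is certainly a square.}
\end{aligned}$$


Thus, 14 is a quadratic residue modulo 137. In fact, the solutions to the congruence
$x^2 \equiv 14 \pmod{137}$ are $x \equiv 39 \pmod{137}$ and $x \equiv 98 \pmod{137}$.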
Here’s a second example that illustrates how the sign can change back and forth 
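a number of times.

$$\begin{aligned}
\left(\frac{55}{179}\right) &= \left(\frac{5}{179}\right)\left(\frac{11}{179}\right) \\
&= \left(\frac{179}{5}\right) \times (-1) \times \left(\frac{179}{11}\right) && \text{since } 5 \equiv 1 \pmod 4 \text{ and } 11 \equiv 179 \equiv 3 \pmod 4, \\
&= \left(\frac{4}{5}\right) \times (-1) \times \left(\frac{3}{11}\right) && \text{since } 179 \equiv 4 \pmod 5 \text{ and } 179 \equiv 3 \pmod{11}, \\
&= 1 \times (-1) \times \left(\frac{3}{11}\right) && \text{since } 4 = 2^2 \text{ is a square,} \\
&= 1 \times (-1) \times (-1) \times \left(\frac{11}{3}\right) && \text{since } 3 \equiv 11 \equiv 3 \pmod 4, \\
&= 1 \times (-1) \times (-1) \times \left(\frac{2}{3}\right) && \text{since } 11 \equiv 2 \pmod 3, \\
&= 1 \times (-1) \times (-1) \times (-1) = -1 && \text{since 2 is a nonresidue mod 3.}
\end{aligned}$$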


So 55 is a nonresidue modulo 179. 

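There is often more than one way to use Quadratic Reciprocity to evaluate a
given Legendre symbol $\left(\frac{a}{p}\right)$, for example, by using the equality $\left(\frac{a}{p}\right) = \left(\frac{a-p}{p}\right)$. Thus
we can compute $\left(\frac{299}{397}\right)$ as

$$\begin{aligned}
\left(\frac{299}{397}\right) &= \left(\frac{13}{397}\right)\left(\frac{23}{397}\right) = \left(\frac{397}{13}\right)\left(\frac{397}{23}\right) = \left(\frac{7}{13}\right)\left(\frac{6}{23}\right) \\
&= \left(\frac{13}{7}\right)\left(\frac{2}{23}\right)\left(\frac{3}{23}\right) = \left(\frac{-1}{7}\right) \cdot 1 \cdot \left(-\left(\frac{23}{3}\right)\right) = (-1) \cdot 1 \cdot \left(-\left(\frac{2}{3}\right)\right) = -1,
\end{aligned}$$

or we can compute it as

$$\left(\frac{299}{397}\right) = \left(\frac{-98}{397}\right) = \left(\frac{-1}{397}\right)\left(\frac{2}{397}\right)\left(\frac{7}{397}\right)^2 = 1 \cdot (-1) \cdot 1 = -1.$$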


Of course, regardless of the path taken, the final destination is always the same. 
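The Law of Quadratic Reciprocity furnishes an extremely efficient way to compute
the Legendre symbol $\left(\frac{a}{p}\right)$ even for very large values of a and p. In fact, the
number of steps to compute $\left(\frac{a}{p}\right)$ is more or less equal to the number of digits in p,
so it is possible to evaluate Legendre symbols for numbers with hundreds of digits.
We won’t spend the time to do an example that is that large, but are content with
the following modest example.

$$\begin{aligned}
\left(\frac{37603}{48611}\right) &= \left(\frac{31}{48611}\right)\left(\frac{1213}{48611}\right) = -\left(\frac{48611}{31}\right)\left(\frac{48611}{1213}\right) \\
&= -\left(\frac{3}{31}\right)\left(\frac{91}{1213}\right) = \left(\frac{31}{3}\right)\left(\frac{7}{1213}\right)\left(\frac{13}{1213}\right) \\
&= \left(\frac{1}{3}\right)\left(\frac{1213}{7}\right)\left(\frac{1213}{13}\right) = \left(\frac{2}{7}\right)\left(\frac{4}{13}\right) = 1.
\end{aligned}$$

Hence, 37603 is a quadratic residue modulo 48611.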

The hardest part of computing $\left(\frac{a}{p}\right)$ lies not in the use of the Law of Quadratic
Reciprocity, but rather in the necessity of factoring the number a before applying
the law. Thus, in our example, it takes some work to recognize that 37603 factors as
31 · 1213, and if a has hundreds of digits, it may be virtually impossible to factor a.
Surprisingly, it is possible to evaluate $\left(\frac{a}{p}\right)$ without doing any difficult factorizations.
The idea is to use the Law of Quadratic Reciprocity to flip the Legendre symbol $\left(\frac{a}{p}\right)$
for any positive odd value of a, completely ignoring the question of whether a is
prime. As usual, if both a and p are congruent to 3 modulo 4, then you must put in
a minus sign. More generally, we can assign a value to the Legendre symbol $\left(\frac{a}{b}\right)$
for any integers a and b provided that b is positive and odd. (This generalized
Legendre symbol is called a Jacobi symbol.) This is done by first factoring b into a
product of primes, $b = p_1 p_2 \cdots p_r$, and then defining $\left(\frac{a}{b}\right)$ as a product of Legendre
symbols,

$$\left(\frac{a}{b}\right) = \left(\frac{a}{p_1}\right)\left(\frac{a}{p_2}\right) \cdots \left(\frac{a}{p_r}\right).$$


We can evaluate the Legendre or Jacobi symbol by repeatedly applying the 
following Generalized Law of Quadratic Reciprocity. 


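Theorem 22.2 (Generalized Law of Quadratic Reciprocity). Let a and b be odd
positive integers.

$$\left(\frac{-1}{b}\right) = \begin{cases} 1 & \text{if } b \equiv 1 \pmod 4, \\ -1 & \text{if } b \equiv 3 \pmod 4. \end{cases}$$

$$\left(\frac{2}{b}\right) = \begin{cases} 1 & \text{if } b \equiv 1 \text{ or } 7 \pmod 8, \\ -1 & \text{if } b \equiv 3 \text{ or } 5 \pmod 8. \end{cases}$$

$$\left(\frac{a}{b}\right) = \begin{cases} \left(\frac{b}{a}\right) & \text{if } a \equiv 1 \pmod 4 \text{ or } b \equiv 1 \pmod 4, \\ -\left(\frac{b}{a}\right) & \text{if } a \equiv b \equiv 3 \pmod 4. \end{cases}$$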


Amazingly enough, if you use these rules, the multiplication formula $\left(\frac{a_1 a_2}{b}\right) =
\left(\frac{a_1}{b}\right)\left(\frac{a_2}{b}\right)$, and the fact that $\left(\frac{a}{b}\right)$ only depends on the value of a modulo b, you’ll
end up with the correct value for the Legendre symbol. The only caveat, and it is
extremely important, is that you’re only allowed to flip $\left(\frac{a}{b}\right)$ for odd positive values
of a. If a is even, then you must first factor off a power of $\left(\frac{2}{b}\right)$, and if it is negative,
then you must factor off the $\left(\frac{-1}{b}\right)$.

We illustrate this new and improved Quadratic Reciprocity Law by recomput- 
ing our earlier example. 


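$$\begin{aligned}
\left(\frac{37603}{48611}\right) &= -\left(\frac{48611}{37603}\right) = -\left(\frac{11008}{37603}\right) = -\left(\frac{2^8 \cdot 43}{37603}\right) = -\left(\frac{43}{37603}\right) \\
&= \left(\frac{37603}{43}\right) = \left(\frac{21}{43}\right) = \left(\frac{43}{21}\right) = \left(\frac{1}{21}\right) = 1.
\end{aligned}$$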


Although this may not look much shorter than before, it actually required much 
less work, because we didn’t need to find the prime factorization of 37603. 
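
The flip-and-reduce procedure just illustrated can be written as a short loop. The following Python sketch is one possible arrangement of the rules in Theorem 22.2, offered as an illustration rather than as the book's own program; the function name `jacobi` is chosen here. It removes factors of 2 using the rule for $\left(\frac{2}{b}\right)$, flips the symbol for odd positive a with a sign change when $a \equiv b \equiv 3 \pmod 4$, and then reduces modulo the new bottom entry. Negative values of a are handled by the initial reduction modulo b, which is allowed because the symbol only depends on a modulo b.

```python
def jacobi(a: int, b: int) -> int:
    """Jacobi symbol (a/b) for an odd positive integer b, with no factoring of a or b."""
    if b <= 0 or b % 2 == 0:
        raise ValueError("b must be a positive odd integer")
    a %= b                      # (a/b) only depends on a modulo b
    result = 1
    while a != 0:
        # Factor off powers of 2 using the rule for (2/b).
        while a % 2 == 0:
            a //= 2
            if b % 8 in (3, 5):
                result = -result
        # Now a is odd and positive, so the symbol may be flipped.
        if a % 4 == 3 and b % 4 == 3:
            result = -result
        a, b = b % a, a         # flip, then reduce the new top modulo the new bottom
    # If the loop ends with b > 1, then a and b had a common factor.
    return result if b == 1 else 0


if __name__ == "__main__":
    print(jacobi(37603, 48611))
    print(jacobi(14, 137))
    print(jacobi(55, 179))
```

When b is prime, the returned value is the Legendre symbol. For composite b, a value of 1 does not by itself guarantee that a is a square modulo b.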

We have just verified that 37603 is a quadratic residue modulo 48611, so the 
congruence 


$$x^2 \equiv 37603 \pmod{48611}$$


has a solution (in fact, two solutions). Unfortunately, nothing we have done helps
us to find the solutions, which turn out to be


$$x \equiv 17173 \pmod{48611} \qquad\text{and}\qquad x \equiv 31438 \pmod{48611}.$$


However, there do exist more advanced methods that actually solve the congruence
$x^2 \equiv a \pmod p$. And for certain special sorts of primes, it is possible to write
down the solutions explicitly; see Exercises 22.7 and 22.8. 
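
As a quick illustration of the "special sorts of primes" remark, the modulus 48611 happens to be congruent to 3 modulo 4, which is the case treated in Exercise 22.7. The Python sketch below simply evaluates the formula $x = a^{(p+1)/4}$ from that exercise with successive squaring (Python's built-in three-argument `pow`) and checks the result by squaring it. It is a numerical check under the assumption that p is prime, not a proof of the formula, and the function name `sqrt_mod_p3` is made up for this example.

```python
def sqrt_mod_p3(a: int, p: int) -> int:
    """Square root of a modulo a prime p with p = 3 (mod 4), using x = a^((p+1)/4)."""
    if p % 4 != 3:
        raise ValueError("this formula is only for primes congruent to 3 modulo 4")
    x = pow(a, (p + 1) // 4, p)
    if (x * x - a) % p != 0:
        # The formula only produces a root when a is a quadratic residue.
        raise ValueError("a is not a quadratic residue modulo p")
    return x


if __name__ == "__main__":
    x = sqrt_mod_p3(37603, 48611)
    print(x, 48611 - x)
```

The two printed numbers are the two solutions of the congruence; the second is the negative of the first modulo p.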

We conclude this section by proving the first part of the Generalized Law of 
Quadratic Reciprocity (Theorem 22.2), and we leave the second and third parts for 
you to do (Exercise 22.6). So we are given an odd positive integer b and we want 
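to compute $\left(\frac{-1}{b}\right)$. When we factor b as a product of primes, some of the factors are
congruent to 1 modulo 4 and some of them are congruent to 3 modulo 4, say

$$b = p_1 p_2 \cdots p_r q_1 q_2 \cdots q_s \qquad\text{with } p_i \equiv 1 \pmod 4 \text{ and } q_j \equiv 3 \pmod 4.$$

Each $q_j$ is congruent to $-1$ modulo 4, so $b \equiv (-1)^s \pmod 4$. In other words, we observe that

$$b \equiv \begin{cases} 1 \pmod 4 & \text{if } s \text{ is even,} \\ 3 \pmod 4 & \text{if } s \text{ is odd.} \end{cases}$$

From the definition of the Jacobi symbol we have

$$\left(\frac{-1}{b}\right) = \left(\frac{-1}{p_1}\right) \cdots \left(\frac{-1}{p_r}\right)\left(\frac{-1}{q_1}\right) \cdots \left(\frac{-1}{q_s}\right).$$

The original version of Quadratic Reciprocity (Theorem 22.1) says that $\left(\frac{-1}{p_i}\right) = 1$
and $\left(\frac{-1}{q_j}\right) = -1$, so

$$\left(\frac{-1}{b}\right) = (-1)^s = \begin{cases} 1 & \text{if } s \text{ is even,} \\ -1 & \text{if } s \text{ is odd.} \end{cases}$$

Comparing this with our earlier description of b (mod 4), we have proven that

$$b \equiv 1 \pmod 4 \iff s \text{ is even} \iff \left(\frac{-1}{b}\right) = 1,$$

$$b \equiv 3 \pmod 4 \iff s \text{ is odd} \iff \left(\frac{-1}{b}\right) = -1.$$

This is the desired result.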


Exercises 


22.1. Use the Law of Quadratic Reciprocity to compute the following Legendre symbols. 
85 29 101 31706 
: b —— d eae 
(a) (Ta) (b) (=r) ©) (sear) ¢) (Fas) 


22.2. Does the congruence 


$$x^2 - 3x - 1 \equiv 0 \pmod{31957}$$


have any solutions? [Hint. Use the quadratic formula to find out what number you need to 
take the square root of modulo the prime 31957. ] 


22.3. Show that there are infinitely many primes congruent to 1 modulo 3. [Hint. See the 
proof of the “1 (Modulo 4) Theorem” in Chapter 21, use $A = (2p_1 p_2 \cdots p_r)^2 + 3$, and try
to pick out a good prime dividing A.] 


22.4. Let p be a prime number ($p \ne 2$ and $p \ne 5$), and let A be some given number.
Suppose that p divides the number $A^2 - 5$. Show that p must be congruent to either 1 or 4
modulo 5. 


22.5. Write a program that uses Quadratic Reciprocity to compute the Legendre symbol
$\left(\frac{a}{p}\right)$ or, more generally, the Jacobi symbol $\left(\frac{a}{b}\right)$.

22.6. (a) Prove the second part of the Generalized Law of Quadratic Reciprocity (Theorem 22.2);
that is, prove that $\left(\frac{2}{b}\right)$ equals 1 if $b \equiv 1$ or 7 modulo 8 and equals $-1$ if
$b \equiv 3$ or 5 modulo 8.

(b) Prove the third part of the Generalized Law of Quadratic Reciprocity (Theorem 22.2);
that is, prove that $\left(\frac{a}{b}\right)$ equals $\left(\frac{b}{a}\right)$ if a or b is congruent to 1 modulo 4 and equals $-\left(\frac{b}{a}\right)$
if both a and b are congruent to 3 modulo 4.

22.7. Let p be a prime satisfying $p \equiv 3 \pmod 4$ and suppose that a is a quadratic residue
modulo p.
(a) Show that $x = a^{(p+1)/4}$ is a solution to the congruence

$$x^2 \equiv a \pmod p.$$

This gives an explicit way to find square roots modulo p for primes congruent to 3
modulo 4.

(b) Find a solution to the congruence $x^2 \equiv 7 \pmod{787}$. (Your answer should lie between
1 and 786.)


22.8. Let p be a prime satisfying $p \equiv 5 \pmod 8$ and suppose that a is a quadratic residue
modulo p.
(a) Show that one of the values

$$x = a^{(p+3)/8} \qquad\text{or}\qquad x = 2a \cdot (4a)^{(p-5)/8}$$

is a solution to the congruence

$$x^2 \equiv a \pmod p.$$

This gives an explicit way to find square roots modulo p for primes congruent to 5
modulo 8.

(b) Find a solution to the congruence $x^2 \equiv 5 \pmod{541}$. (Give an answer lying between
1 and 540.)

(c) Find a solution to the congruence $x^2 \equiv 13 \pmod{653}$. (Give an answer lying between
1 and 652.)


22.9. Let p be a prime that is congruent to 5 modulo 8. Write a program to solve the
congruence

$$x^2 \equiv a \pmod p$$

using the method described in the previous exercise and successive squaring. The output
should be a solution satisfying 0 < x < p. Be sure to check that a is a quadratic residue,
and return an error message if it is not. Use your program to solve the congruences

$$x^2 \equiv 17 \pmod{1021}, \qquad x^2 \equiv 23 \pmod{1021}, \qquad x^2 \equiv 31 \pmod{1021}.$$


22.10. If $a^{m-1} \not\equiv 1 \pmod m$, then Fermat’s Little Theorem tells us that m is composite.
On the other hand, even if

$$a^{m-1} \equiv 1 \pmod m$$

for some (or all) a’s satisfying gcd(a, m) = 1, we cannot conclude that m is prime. This
exercise describes a way to use Quadratic Reciprocity to check if a number is probably
prime. (You might compare this method with the Rabin-Miller test described in Chapter 19.)

(a) Euler’s criterion says that if p is prime then

$$a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod p.$$

Use successive squaring to compute $11^{864} \pmod{1729}$ and use Quadratic Reciprocity
to compute $\left(\frac{11}{1729}\right)$. Do they agree? What can you conclude concerning the
possible primality of 1729?

(b) Use successive squaring to compute the quantities

$$2^{(1293337-1)/2} \pmod{1293337} \qquad\text{and}\qquad 2^{1293336} \pmod{1293337}.$$

What can you conclude concerning the possible primality of 1293337?


Chapter 23 


Proof of Quadratic Reciprocity 


The Law of Quadratic Reciprocity (Theorem 22.1) has three parts. The first part 
tells us when —1 is a square modulo p, the second part tells us when 2 is a square 
modulo p, and the third part relates p being a square modulo q to q being a square 
modulo p. We already proved the first two parts of quadratic reciprocity in Chapter 21.
The third part, which says that

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}$$

for odd primes p and q, is more difficult to prove. In this chapter we build upon
the ideas used to prove the second part to give Eisenstein’s proof of the third part. 
However, if you want to skip the proof for now, you can proceed to the next chap- 
ter and return here whenever you feel ready to complete the proof of Quadratic 
Reciprocity. 

Eisenstein’s proof uses a criterion of Gauss that we already discussed in Chapter 21
when we computed $\left(\frac{2}{p}\right)$: Let p be an odd prime, let a be any integer not
divisible by p, and for convenience, let

$$P = \frac{p-1}{2}.$$

We consider the list of numbers

$$a,\ 2a,\ 3a,\ \ldots,\ Pa$$

and we reduce them modulo p into the range from $-P$ to $P$. Some of the reduced
values will be positive and some of them will be negative. Let

$$\mu(a, p) = \left(\begin{array}{l} \text{number of integers in the list } a, 2a, 3a, \ldots, Pa \text{ that become} \\ \text{negative when the integers in the list are reduced} \\ \text{modulo } p \text{ into the interval from } -P \text{ to } P \end{array}\right).$$




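We illustrate by computing Gauss’s $\mu$ value for p = 13 and a = 7, so $P =
\frac{13-1}{2} = 6$. We start with the six numbers

$$7,\ 14,\ 21,\ 28,\ 35,\ 42$$

and reduce them modulo 13 into the range from $-6$ to 6, which gives

$$-6,\ 1,\ -5,\ 2,\ -4,\ 3.$$

Three of the residues are negative, so $\mu(7, 13) = 3$.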

Gauss’s Criterion, which we now state and prove, says that $\mu(a, p)$ can be used
to determine if a is a square modulo p. (We proved Gauss’s Criterion for a = 2 in
Chapter 21.)


Theorem 23.1 (Gauss’s Criterion). Let p be an odd prime, let a be an integer that is
not divisible by p, and let $\mu(a, p)$ be the number defined above. Then

$$\left(\frac{a}{p}\right) = (-1)^{\mu(a,p)}.$$

Before starting the proof of Gauss’s Criterion, we first prove a lemma that 


describes what happens when we reduce a, 2a, 3a,..., Pa modulo p. 
Lemma 23.2. When the numbers $a, 2a, 3a, \ldots, Pa$ are reduced modulo p into the
range from $-P$ to $P$, the reduced values are $\pm 1, \ldots, \pm P$ in some order, with each
number appearing once with either a plus sign or a minus sign.

Proof of Lemma 23.2. We write each multiple ka as

$$ka = p q_k + r_k \qquad\text{with } -P \le r_k \le P.$$

Suppose that two of the $r_k$ values are either the same or negatives of each other,
say $r_i = e r_j$ with $e = \pm 1$. Then

$$ia - eja = (p q_i + r_i) - e(p q_j + r_j) = p(q_i - e q_j),$$

so p divides $(i - ej)a$. But p is prime and a is not divisible by p, so we conclude
that p divides $i - ej$. However,

$$|i - ej| \le |i| + |ej| = i + j \le P + P = p - 1.$$

So the only way for $i - ej$ to be divisible by p is to have $i - ej = 0$. Since $e = \pm 1$
and i and j are positive, it follows that $i = j$.

We have thus shown that the numbers $r_1, r_2, \ldots, r_P$ are all different, even if
we change their signs. Since they are all between $-P$ and $P$, and none of them is
zero, it follows that each of the numbers $1, 2, \ldots, P$, with either a plus or a minus
sign, appears exactly once in the list of numbers $r_1, r_2, \ldots, r_P$. □


We now use the lemma to prove Gauss’s Criterion. Later in this section the 
lemma will be used a second time to prove an important formula needed for Eisen- 
stein’s proof of Quadratic Reciprocity. 


Proof of Gauss’s Criterion (Theorem 23.1). We start by taking the list of numbers
$a, 2a, \ldots, Pa$ and multiplying them. The product equals

$$a \cdot 2a \cdot 3a \cdots Pa = a^P (1 \cdot 2 \cdot 3 \cdots P) = a^P \cdot P!. \qquad (*)$$

On the other hand, Lemma 23.2 tells us that

$$a \cdot 2a \cdot 3a \cdots Pa \equiv (\pm 1) \cdot (\pm 2) \cdot (\pm 3) \cdots (\pm P) \pmod p,$$

where the number of minus signs is $\mu(a, p)$, because that’s exactly how $\mu(a, p)$ is
defined. So

$$a \cdot 2a \cdot 3a \cdots Pa \equiv (-1)^{\mu(a,p)} \cdot 1 \cdot 2 \cdot 3 \cdots P \equiv (-1)^{\mu(a,p)} P! \pmod p. \qquad (**)$$

Comparing the formula (*) to the congruence (**), we see that

$$a^P \cdot P! \equiv (-1)^{\mu(a,p)} P! \pmod p.$$

The number P! is not divisible by p, so we may cancel it from both sides of the
congruence to get

$$a^P \equiv (-1)^{\mu(a,p)} \pmod p.$$

Finally, we observe that Euler’s Criterion (Theorem 21.1) says that

$$a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod p$$

(remember that $P = \frac{p-1}{2}$), so

$$\left(\frac{a}{p}\right) \equiv (-1)^{\mu(a,p)} \pmod p.$$

This says that $\left(\frac{a}{p}\right) - (-1)^{\mu(a,p)}$ is divisible by p. But the quantity $\left(\frac{a}{p}\right) - (-1)^{\mu(a,p)}$
equals either $-2$, 0, or 2, while $p \ge 3$, so we conclude that $\left(\frac{a}{p}\right) - (-1)^{\mu(a,p)} = 0$.
This completes the proof of Gauss’s Criterion.
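
Gauss's Criterion is easy to test by machine. The following Python sketch is an illustration, not part of the original text. It counts how many of the multiples $a, 2a, \ldots, Pa$ land in the negative half of the range from $-P$ to $P$, and then compares $(-1)^{\mu(a,p)}$ with the value given by Euler's Criterion. The names `gauss_mu` and `legendre_euler` are chosen here for the example.

```python
def gauss_mu(a: int, p: int) -> int:
    """Count the multiples a, 2a, ..., Pa that become negative when
    reduced modulo p into the range from -P to P, where P = (p - 1)/2."""
    P = (p - 1) // 2
    count = 0
    for k in range(1, P + 1):
        r = (k * a) % p          # residue in the range 0..p-1
        if r > P:                # then r - p is the reduced value, and it is negative
            count += 1
    return count


def legendre_euler(a: int, p: int) -> int:
    """Legendre symbol (a/p) via Euler's Criterion, for comparison."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


if __name__ == "__main__":
    print(gauss_mu(7, 13))                       # the example worked in the text
    for p in (3, 5, 7, 11, 13, 17, 19, 23):
        for a in range(1, p):
            assert (-1) ** gauss_mu(a, p) == legendre_euler(a, p)
    print("Gauss's Criterion agrees with Euler's Criterion for these primes")
```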
Eisenstein’s proof of Quadratic Reciprocity employs a useful gadget called the
floor function or greatest integer function. For a real number t, it is denoted $\lfloor t \rfloor$
and is defined by

$$\lfloor t \rfloor = (\text{largest integer } n \text{ satisfying } n \le t).$$

For example,

$$\left\lfloor \frac{22}{7} \right\rfloor = 3 \qquad\text{and}\qquad \lfloor 4 \rfloor = 4.$$

We now prove an identity of Eisenstein that lies at the heart of his proof of Quadratic
Reciprocity.


Lemma 23.3. Let p be an odd prime, let $P = \frac{p-1}{2}$, let a be an odd integer that is
not divisible by p, and let $\mu(a, p)$ be the quantity that appears
in Gauss’s Criterion (Theorem 23.1). Then

$$\sum_{k=1}^{P} \left\lfloor \frac{ka}{p} \right\rfloor \equiv \mu(a, p) \pmod 2.$$


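Before proving the lemma, we illustrate it with a = 7 and p = 13. Then
P = 6, so the sum is

$$\begin{aligned}
\sum_{k=1}^{6} \left\lfloor \frac{7k}{13} \right\rfloor &= \left\lfloor \frac{7}{13} \right\rfloor + \left\lfloor \frac{14}{13} \right\rfloor + \left\lfloor \frac{21}{13} \right\rfloor + \left\lfloor \frac{28}{13} \right\rfloor + \left\lfloor \frac{35}{13} \right\rfloor + \left\lfloor \frac{42}{13} \right\rfloor \\
&= 0 + 1 + 1 + 2 + 2 + 3 \\
&= 9.
\end{aligned}$$

Earlier in this chapter we computed $\mu(7, 13) = 3$. Notice that the sum is not equal
to $\mu(7, 13)$, but they are congruent modulo 2, since they are both odd.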
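
The same comparison can be repeated for many values of a and p. The following Python sketch is illustrative only, with function names chosen here. It computes the floor sum of Lemma 23.3 with integer division and checks that it has the same parity as $\mu(a, p)$ for positive odd a not divisible by p.

```python
def gauss_mu(a: int, p: int) -> int:
    """Number of multiples a, 2a, ..., Pa whose reduced value modulo p is negative."""
    P = (p - 1) // 2
    return sum(1 for k in range(1, P + 1) if (k * a) % p > P)


def floor_sum(a: int, p: int) -> int:
    """Sum of floor(k*a/p) for k = 1, ..., P, where P = (p - 1)/2 and a > 0."""
    P = (p - 1) // 2
    return sum((k * a) // p for k in range(1, P + 1))


if __name__ == "__main__":
    print(floor_sum(7, 13), gauss_mu(7, 13))     # the example from the text
    for p in (3, 5, 7, 11, 13, 17, 19):
        for a in range(1, 60, 2):                # positive odd values of a
            if a % p != 0:
                assert (floor_sum(a, p) - gauss_mu(a, p)) % 2 == 0
    print("parities agree")
```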


Proof of Lemma 23.3. Just as in the proof of Lemma 23.2, we write each multiple
ka as

$$ka = q_k p + r_k \qquad\text{with } -P \le r_k \le P.$$

We next divide by p to obtain

$$\frac{ka}{p} = q_k + \frac{r_k}{p} \qquad\text{with } -\frac{1}{2} < \frac{r_k}{p} < \frac{1}{2}.$$

Taking the floor of both sides, we see that

$$\left\lfloor \frac{ka}{p} \right\rfloor = \begin{cases} q_k & \text{if } r_k > 0, \\ q_k - 1 & \text{if } r_k < 0. \end{cases}$$

So if we add the values $\left\lfloor \frac{ka}{p} \right\rfloor$ for $k = 1, 2, \ldots, P$, we get

$$\sum_{k=1}^{P} \left\lfloor \frac{ka}{p} \right\rfloor = \sum_{k=1}^{P} q_k - (\text{number of } k \text{ such that } r_k \text{ is negative}) = \sum_{k=1}^{P} q_k - \mu(a, p). \qquad (\dagger)$$

Our final task is to compute the sum $q_1 + \cdots + q_P$, but we only need to compute
it modulo 2. Note that if we reduce the formula

$$ka = q_k p + r_k$$

modulo 2 and use the fact that both a and p are odd, we get

$$k \equiv q_k + r_k \pmod 2.$$

Summing, we find that

$$\sum_{k=1}^{P} k \equiv \sum_{k=1}^{P} q_k + \sum_{k=1}^{P} r_k \pmod 2. \qquad (\ddagger)$$

However, Lemma 23.2 tells us that the numbers $r_1, r_2, \ldots, r_P$ are equal to the
numbers $\pm 1, \ldots, \pm P$ in some order, with each number from 1 to P appearing
once with either a plus sign or a minus sign. Since we are working modulo 2, the
signs are irrelevant, so we see that

$$\sum_{k=1}^{P} r_k \equiv 1 + 2 + \cdots + P \pmod 2.$$

In other words, the sums $\sum k$ and $\sum r_k$ appearing in the congruence $(\ddagger)$ are congruent
modulo 2, so we conclude that

$$\sum_{k=1}^{P} q_k \equiv 0 \pmod 2.$$

Now reducing equation $(\dagger)$ modulo 2, we find that

$$\sum_{k=1}^{P} \left\lfloor \frac{ka}{p} \right\rfloor = \sum_{k=1}^{P} q_k - \mu(a, p) \equiv \mu(a, p) \pmod 2,$$

which is exactly what we are trying to prove. □
Proof of Quadratic Reciprocity. We now have all of the tools that we need to prove
Quadratic Reciprocity. The proof uses some geometry. Let p and q be odd primes,
and let

$$P = \frac{p-1}{2} \qquad\text{and}\qquad Q = \frac{q-1}{2}.$$

We consider the triangle $T(q, p)$ in the xy-plane whose vertices are the points
$(0, 0)$, $\left(\frac{p}{2}, 0\right)$, and $\left(\frac{p}{2}, \frac{q}{2}\right)$, as illustrated in Figure 23.1. We are going to count the
number of points with integer coordinates that lie inside the triangle $T(q, p)$. For
example, the triangle in Figure 23.1 contains 19 integer points, since we only count
the points that are strictly inside the triangle, not the ones lying on the bottom line
segment.


Figure 23.1: The Triangle $T(q, p)$


We count the integer points in $T(q, p)$ by counting the number of such points
with x = 1, then the number with x = 2, and so on. The hypotenuse of the
triangle $T(q, p)$ lies on the line

$$y = \frac{q}{p} x,$$

so for x = 1 we get $\left\lfloor \frac{q}{p} \right\rfloor$ points, for x = 2 we get $\left\lfloor \frac{2q}{p} \right\rfloor$ points, and so on. In other
words,

$$\left(\begin{array}{l} \text{number of points with integer} \\ \text{coordinates in triangle } T(q, p) \end{array}\right) = \sum_{k=1}^{P} \left\lfloor \frac{kq}{p} \right\rfloor.$$

The left-hand triangle in Figure 23.2 illustrates this formula with q = 7 and
p = 13. Counting the integer points in each column of the picture, the number of
integer points in triangle $T(7, 13)$ is

$$\left\lfloor \frac{7}{13} \right\rfloor + \left\lfloor \frac{14}{13} \right\rfloor + \left\lfloor \frac{21}{13} \right\rfloor + \left\lfloor \frac{28}{13} \right\rfloor + \left\lfloor \frac{35}{13} \right\rfloor + \left\lfloor \frac{42}{13} \right\rfloor = 0 + 1 + 1 + 2 + 2 + 3 = 9.$$

Figure 23.2: Counting Integer Points in Triangles (left: the triangle $T(7, 13)$; right: the triangle $T'(13, 7)$)


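Returning now to the general case, we next count the number of integer points
in the triangle $T'(p, q)$ whose vertices are $(0, 0)$, $\left(0, \frac{q}{2}\right)$, and $\left(\frac{p}{2}, \frac{q}{2}\right)$. The triangle
$T'(13, 7)$ is illustrated in the right-hand side of Figure 23.2. We count the number
of integer points in $T'(p, q)$ by counting them horizontally, so first we count the
number of points with y = 1, then the number of points with y = 2, and so on.
Repeating our earlier argument, we find that

$$\left(\begin{array}{l} \text{number of points with integer} \\ \text{coordinates in triangle } T'(p, q) \end{array}\right) = \sum_{k=1}^{Q} \left\lfloor \frac{kp}{q} \right\rfloor.$$

For example, the number of integer points in the triangle $T'(13, 7)$ illustrated in
Figure 23.2, counted horizontally row-by-row, is

$$\left\lfloor \frac{13}{7} \right\rfloor + \left\lfloor \frac{26}{7} \right\rfloor + \left\lfloor \frac{39}{7} \right\rfloor = 1 + 3 + 5 = 9.$$

Comparing the formulas for the number of integer points in the triangles $T(q, p)$
and $T'(p, q)$ to the formula in Lemma 23.3, we find that

$$\begin{aligned}
\left(\begin{array}{l} \text{number of} \\ \text{integer points} \\ \text{in } T'(p, q) \end{array}\right) + \left(\begin{array}{l} \text{number of} \\ \text{integer points} \\ \text{in } T(q, p) \end{array}\right) &= \sum_{k=1}^{Q} \left\lfloor \frac{kp}{q} \right\rfloor + \sum_{k=1}^{P} \left\lfloor \frac{kq}{p} \right\rfloor \\
&\equiv \mu(p, q) + \mu(q, p) \pmod 2.
\end{aligned}$$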


The end of the proof is simple, but clever. Consider the two triangles $T(q, p)$
and $T'(p, q)$. They fit together to form a rectangle with vertices $(0, 0)$, $\left(\frac{p}{2}, 0\right)$,
$\left(0, \frac{q}{2}\right)$, and $\left(\frac{p}{2}, \frac{q}{2}\right)$, as illustrated in Figure 23.3. The number of integer points in
the rectangle is easy to compute. The rectangle contains $\left\lfloor \frac{p}{2} \right\rfloor$ columns of integer
points, and each column contains $\left\lfloor \frac{q}{2} \right\rfloor$ integer points, so

$$\begin{aligned}
\left(\begin{array}{l} \text{number of} \\ \text{integer points} \\ \text{in } T'(p, q) \end{array}\right) + \left(\begin{array}{l} \text{number of} \\ \text{integer points} \\ \text{in } T(q, p) \end{array}\right) &= \left(\begin{array}{l} \text{number of integer points} \\ \text{in the rectangle with vertices} \\ (0, 0),\ \left(\frac{p}{2}, 0\right),\ \left(0, \frac{q}{2}\right),\ \text{and } \left(\frac{p}{2}, \frac{q}{2}\right) \end{array}\right) \\
&= \left\lfloor \frac{p}{2} \right\rfloor \cdot \left\lfloor \frac{q}{2} \right\rfloor \\
&= \frac{p-1}{2} \cdot \frac{q-1}{2}.
\end{aligned}$$

Figure 23.3: Forming a rectangle from $T(q, p)$ and $T'(p, q)$

[We should also note that the only integer point on the diagonal of the rectangle is
the point $(0, 0)$, since points on the diagonal lie on the line $y = \frac{q}{p} x$, and the integer
points on this line all have the form $(kp, kq)$ for some integer k.]

Combining this formula with our earlier formula for the sum of the number of
integer points in the two triangles, we see that

$$\mu(q, p) + \mu(p, q) \equiv \frac{p-1}{2} \cdot \frac{q-1}{2} \pmod 2.$$

All that remains is to use Gauss’s Criterion (Theorem 23.1) to compute

$$\left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^{\mu(q,p)} \cdot (-1)^{\mu(p,q)} = (-1)^{\mu(q,p) + \mu(p,q)} = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.$$

This completes our proof of the third part of Quadratic Reciprocity. □
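
The counting step at the heart of the proof can be checked directly. The Python sketch below is an illustration with function names chosen here. It counts the integer points in the two triangles column by column and row by row, exactly as in the proof, and confirms that the two counts add up to $\frac{p-1}{2} \cdot \frac{q-1}{2}$, the number of integer points in the rectangle.

```python
def points_in_T(q: int, p: int) -> int:
    """Integer points in the triangle T(q, p), counted column by column:
    the sum of floor(k*q/p) for k = 1, ..., (p - 1)/2."""
    return sum((k * q) // p for k in range(1, (p - 1) // 2 + 1))


def points_in_T_prime(p: int, q: int) -> int:
    """Integer points in the triangle T'(p, q), counted row by row:
    the sum of floor(k*p/q) for k = 1, ..., (q - 1)/2."""
    return sum((k * p) // q for k in range(1, (q - 1) // 2 + 1))


if __name__ == "__main__":
    p, q = 13, 7
    left, right = points_in_T(q, p), points_in_T_prime(p, q)
    rectangle = ((p - 1) // 2) * ((q - 1) // 2)
    print(left, right, rectangle)                # the example from the text
    assert left + right == rectangle
```

For p = 13 and q = 7 the two triangles happen to contain the same number of points; in general only the total is forced to equal the rectangle count.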


Exercises 


23.1. Compute the following values. 


@ |-7] @ v3) @lr| @ a 
23.2. This exercise asks you to explore some properties of the function

$$f(x) = \lfloor 2x \rfloor - 2\lfloor x \rfloor,$$

where x is allowed to be any real number.
(a) If n is an integer, how are the values of f(x) and f(x + n) related?
(b) Compute the value of f(x) for several values of x between 0 and 1 and make a
conjecture about the value of f(x).
(c) Prove your conjecture from (b). 


23.3. This exercise asks you to explore some properties of the function 


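$$g(x) = \lfloor x \rfloor + \left\lfloor x + \tfrac{1}{2} \right\rfloor,$$

where x is allowed to be any real number.
(a) Compute the following values of g(x):

g(0), g(0.25), g(0.5), g(1), g(2), g(2.5), g(2.499).

(b) Using your results from (a), make a conjecture that $g(x) = \lfloor kx \rfloor$ for a particular
value of k.

(c) Prove that your conjecture in (b) is correct.
(d) Find and prove a formula for the function

$$g(x) = \lfloor x \rfloor + \left\lfloor x + \tfrac{1}{3} \right\rfloor + \left\lfloor x + \tfrac{2}{3} \right\rfloor.$$

(e) More generally, fix an integer $N \ge 1$ and find and prove a formula for the function

$$g(x) = \lfloor x \rfloor + \left\lfloor x + \tfrac{1}{N} \right\rfloor + \left\lfloor x + \tfrac{2}{N} \right\rfloor + \cdots + \left\lfloor x + \tfrac{N-1}{N} \right\rfloor.$$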


23.4. Let p be an odd prime, let $P = \frac{p-1}{2}$, and let a be an even integer that is not divisible
by p.
(a) Show that

$$\sum_{k=1}^{P} \left\lfloor \frac{ka}{p} \right\rfloor \equiv \frac{p^2 - 1}{8} + \mu(a, p) \pmod 2.$$

[Hint. When a is odd, we proved a similar congruence in Lemma 23.3.]
(b) In particular, take a = 2 and use (a) and Gauss’s Criterion (Theorem 23.1) to show
that

$$\left(\frac{2}{p}\right) = (-1)^{(p^2 - 1)/8}.$$

[Hint. What is the value of $\lfloor 2k/p \rfloor$ when $1 \le k \le P$?]
23.5. Let a and b be positive integers and let T be the triangle whose vertices are (0, 0),
(a, 0), and (a, b). Consider the following three quantities:


A = the area inside the triangle T,
N = the number of integer points strictly inside the triangle T,
B = the number of integer points on the edges of the triangle T.


For example, if a = 6 and b = 2, then we have

$$A = \frac{6 \cdot 2}{2} = 6, \qquad N = 2, \qquad B = 10.$$


(a) Draw a picture for the case that a = 5 and b = 3, and use it to compute the values of
A, N, and B. Then compute $A - N - \frac{1}{2}B$.


(b) Repeat (a) with a = 6 and b = 4. 
(c) Based on your data from (a) and (b), make a conjecture relating A, NV, and B. 


(d) Prove that your conjecture is correct. [Hint. Use two copies of the triangle to form a 
rectangle. ] 


Chapter 24 


Which Primes Are Sums 
of Two Squares? 


Although our exploration of congruences has been interesting and fun, there is no 
doubt that the fundamental questions in number theory are questions about actual 
natural numbers. A congruence 


$$A \equiv B \pmod M$$


is all well and good; it tells you that the difference A − B is a multiple of M, but
it can’t compare to an actual equality


$$A = B.$$


One way to think of congruences is that they are approximations to true equalities. 
Such approximations are not to be despised. They have a certain intrinsic interest 
of their own and, furthermore, they can often be used as tools to construct true 
equalities. This is the path we take in this chapter, where we use the Law of Qua- 
dratic Reciprocity, which is a theorem about congruences, as a tool to construct 
equalities between whole numbers. 

The question we address is as follows: 


Which numbers can be written as sums of two squares? 


For example, 5, 10, and 65 are sums of two squares, since 

$$5 = 2^2 + 1^2, \qquad 10 = 3^2 + 1^2, \qquad\text{and}\qquad 65 = 7^2 + 4^2.$$

On the other hand, the numbers 3, 19, and 154 cannot be written as sums of two 
squares. To see this for 19, for example, we just need to check that none of the 
differences 

$$19 - 1^2 = 18, \quad 19 - 2^2 = 15, \quad 19 - 3^2 = 10, \quad\text{or}\quad 19 - 4^2 = 3$$

is a square. In general, to check if a given number $m$ is a sum of two squares, list 
the numbers 

$$m - 0^2, \quad m - 1^2, \quad m - 2^2, \quad m - 3^2, \quad m - 4^2, \quad \ldots$$

until either you get a square or the numbers become negative.¹ 
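The checking procedure just described is easy to mechanize. Here is a minimal sketch in Python; the function name is our own choice for illustration, and we assume m is a nonnegative integer. It subtracts successive squares from m and tests whether what remains is a perfect square, stopping once $2a^2$ exceeds m, since we may always list the smaller square first.

```python
from math import isqrt

def two_square_representations(m):
    """Return all pairs (a, b) with 0 <= a <= b and a*a + b*b == m."""
    solutions = []
    a = 0
    # Because a <= b, it is enough to try values of a with 2*a*a <= m.
    while 2 * a * a <= m:
        rest = m - a * a
        b = isqrt(rest)          # integer square root of what is left over
        if b * b == rest:        # rest is a perfect square
            solutions.append((a, b))
        a += 1
    return solutions

print(two_square_representations(65))   # [(1, 8), (4, 7)]
print(two_square_representations(19))   # []
```

An empty list means that m is not a sum of two squares, as happens for m = 19.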
As usual, we begin with a short table and look for patterns. 


 1 = $1^2+0^2$    11 NO             21 NO             31 NO             41 = $4^2+5^2$ 
 2 = $1^2+1^2$    12 NO             22 NO             32 = $4^2+4^2$    42 NO 
 3 NO             13 = $2^2+3^2$    23 NO             33 NO             43 NO 
 4 = $0^2+2^2$    14 NO             24 NO             34 = $3^2+5^2$    44 NO 
 5 = $1^2+2^2$    15 NO             25 = $3^2+4^2$    35 NO             45 = $3^2+6^2$ 
 6 NO             16 = $0^2+4^2$    26 = $1^2+5^2$    36 = $0^2+6^2$    46 NO 
 7 NO             17 = $1^2+4^2$    27 NO             37 = $1^2+6^2$    47 NO 
 8 = $2^2+2^2$    18 = $3^2+3^2$    28 NO             38 NO             48 NO 
 9 = $0^2+3^2$    19 NO             29 = $2^2+5^2$    39 NO             49 = $0^2+7^2$ 
10 = $1^2+3^2$    20 = $2^2+4^2$    30 NO             40 = $2^2+6^2$    50 = $5^2+5^2$ 


Numbers That Are Sums of Two Squares 


From the table, we make a list of the numbers that are and are not sums of two 
squares. 


Numbers that are     | 1, 2, 4, 5, 8, 9, 10, 13, 16, 17, 18, 20, 25, 26, 
sums of two squares  | 29, 32, 34, 36, 37, 40, 41, 45, 49, 50 


Numbers that are not | 3, 6, 7, 11, 12, 14, 15, 19, 21, 22, 23, 24, 27, 
sums of two squares  | 28, 30, 31, 33, 35, 38, 39, 42, 43, 44, 46, 47, 48 


Can you spot any patterns? 

One immediate observation is that no number that is congruent to 3 modulo 4 
can be written as a sum of two squares. Looking back at the first two columns of the 
table, we might have also guessed that if $m \equiv 1 \pmod{4}$ then $m$ is a sum of two 
squares. But this guess is not correct, since 21 is not a sum of two squares. Another 
exception is 33. However, both 21 and 33 are composite numbers, $21 = 3 \cdot 7$ and 
$33 = 3 \cdot 11$. If we only look at prime numbers, we see that every prime in our table 
satisfying 

$$p \equiv 1 \pmod{4}$$

is indeed a sum of two squares. This observation reminds us of the “prime directive” 
in number theoretic investigations: always start by investigating prime 
numbers. There are two reasons to do this. First, patterns are usually easier to spot 
for primes. Second, patterns for primes can often be used to deduce patterns for all 
numbers, since the Fundamental Theorem of Arithmetic (Chapter 7) says that the 
primes are the basic building blocks of all numbers. 

¹ Actually, it’s only necessary to check if $m - a^2$ is a square for all $a$’s between 0 and $\sqrt{m/2}$. 
Do you see why this is enough? 

Now that we’ve decided to concentrate on primes, let’s compile a more exten- 
sive list of primes and see which can be written as sums of two squares. 


 2 = $1^2+1^2$    31 NO             73 = $3^2+8^2$     127 NO              179 NO 
 3 NO             37 = $1^2+6^2$    79 NO              131 NO              181 = $9^2+10^2$ 
 5 = $1^2+2^2$    41 = $4^2+5^2$    83 NO              137 = $4^2+11^2$    191 NO 
 7 NO             43 NO             89 = $5^2+8^2$     139 NO              193 = $7^2+12^2$ 
11 NO             47 NO             97 = $4^2+9^2$     149 = $7^2+10^2$    197 = $1^2+14^2$ 
13 = $2^2+3^2$    53 = $2^2+7^2$    101 = $1^2+10^2$   151 NO              199 NO 
17 = $1^2+4^2$    59 NO             103 NO             157 = $6^2+11^2$    211 NO 
19 NO             61 = $5^2+6^2$    107 NO             163 NO              223 NO 
23 NO             67 NO             109 = $3^2+10^2$   167 NO              227 NO 
29 = $2^2+5^2$    71 NO             113 = $7^2+8^2$    173 = $2^2+13^2$    229 = $2^2+15^2$ 


Primes That Are Sums of Two Squares 


This gives the following two lists. 


Primes that are      | 2, 5, 13, 17, 29, 37, 41, 53, 61, 73, 89, 97, 101, 
sums of two squares  | 109, 113, 137, 149, 157, 173, 181, 193, 197, 229 


Primes that are      | 3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83, 
not sums of          | 103, 107, 127, 131, 139, 151, 163, 167, 179, 191, 
two squares          | 199, 211, 223, 227 


The right conjecture is obvious. Primes that are congruent to 1 modulo 4 seem to 
be sums of two squares, and primes that are congruent to 3 modulo 4 seem not to 
be. (We’re ignoring 2, which is a sum of two squares, but occupies a somewhat 
anomalous position.) The rest of this chapter is devoted to a discussion and proof 
of this conjecture. 


Theorem 24.1 (Sum of Two Squares Theorem for Primes). Let p be a prime. Then 
p is a sum of two squares exactly when 

$$p \equiv 1 \pmod{4} \qquad (\text{or } p = 2).$$

The Sum of Two Squares Theorem really consists of two statements. 

Statement 1. If p is a sum of two squares, then $p \equiv 1 \pmod{4}$. 

Statement 2. If $p \equiv 1 \pmod{4}$, then p is a sum of two squares. 

One of these statements is fairly easy to verify, while the other is quite difficult. 
Can you guess which is which without actually trying to prove either of them? 
This is not an idle or frivolous question. Before trying to verify a mathematical 
statement, it helps to have some idea of how difficult the proof is likely to be, or, 
as a mathematician would say, to know the depth of the statement. The proof of a 
deep theorem is likely to require stronger tools and more effort than the proof of 
a “shallower” theorem, just as it requires specialized machinery and great effort to 
build a skyscraper, while hammer and nails suffice to construct a birdhouse. 

So my question to you is “Which of the statements 1 and 2 is deeper?” Intu- 
itively, a statement is deep if it starts with an easy assertion and uses it to prove a 
difficult assertion. Statements 1 and 2 deal with the following two assertions: 


Assertion A. p is a sum of two squares. 
Assertion B. $p \equiv 1 \pmod{4}$. 


Clearly, B is an easy assertion since for any given prime number p, it is easy to 
check whether it is true. Assertion A, on the other hand, is more difficult, since it 
can take a lot of work to check whether a given prime p is a sum of two squares. 
Thus, statement 1 says that if the deep assertion A is true, then so is the easy asser- 
tion B. This suggests that statement 1 won’t be too difficult to prove. Statement 2 
says that if the easy assertion B is true, then the deep assertion A is also true. This 
suggests that a proof of statement 2 is likely to be difficult. 

Now that we know that statement 1 should be easy to prove, let’s prove it. We 
are told that the prime p is a sum of two squares, say 


$$p = a^2 + b^2.$$

We also know that p is odd (we are setting aside $p = 2$), so one of a and b must be odd and the other one must 
be even. Switching them if necessary, we may assume that a is odd and b is even, 
say 

$$a = 2n + 1 \qquad\text{and}\qquad b = 2m.$$

Then 

$$p = a^2 + b^2 = (2n+1)^2 + (2m)^2 = 4n^2 + 4n + 1 + 4m^2 \equiv 1 \pmod{4},$$


which is exactly what we were trying to prove. 

Having given this very easy proof of statement 1, I want to show you a more 
complicated proof. Why would we ever want to use a complicated proof in place 
of an easy one? One answer is that frequently the more complicated argument can 
be applied in situations where the simple ideas do not work. 

Our easy proof was to take the given formula $p = a^2 + b^2$, reduce it modulo 4, 
and deduce something about p modulo 4. That’s a very natural way to proceed. For 
our new proof, we reduce the formula modulo p. This gives 

$$0 \equiv a^2 + b^2 \pmod{p}, \qquad\text{so}\qquad -a^2 \equiv b^2 \pmod{p}.$$

Next we take the Legendre symbol of both sides. 

$$\left(\frac{-a^2}{p}\right) = \left(\frac{b^2}{p}\right)$$

$$\left(\frac{-1}{p}\right)\left(\frac{a}{p}\right)^2 = \left(\frac{b}{p}\right)^2$$

$$\left(\frac{-1}{p}\right) = 1$$

Thus, —1 is a quadratic residue modulo p, so the Law of Quadratic Reciprocity 
(Chapter 22) tells us that p = 1 (mod 4). This second proof is especially amusing 
because we reduce modulo p to get information modulo 4. 
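As a quick numerical sanity check of this conclusion (it is evidence, not a proof), the following sketch tests by brute force whether −1 is a square modulo p for a few small odd primes and prints the result next to p mod 4. The function name is ours, chosen only for illustration.

```python
def minus_one_is_square_mod(p):
    """Brute force: is there an x with x*x congruent to -1 modulo p?"""
    return any((x * x + 1) % p == 0 for x in range(1, p))

for p in [3, 5, 7, 11, 13, 17, 19, 23, 29]:
    # The last column should be True exactly when p % 4 == 1.
    print(p, p % 4, minus_one_is_square_mod(p))
```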

The proof of statement 2, that every prime p = 1 (mod 4) can be written as 
a sum of two squares, is more difficult. The proof we give is based on Fermat’s 
famous Method of Descent and in this form is essentially due to Euler. We start by 
describing the basic idea of Fermat’s descent method, since once you understand 
the concept, the details become much less fearsome. 

We assume that $p \equiv 1 \pmod{4}$, and we want to write p as a sum of two 
squares. Rather than immediately trying to write $p = a^2 + b^2$, let’s tackle the less 
onerous task of writing some multiple of p as a sum of two squares. For example, 
Quadratic Reciprocity tells us that $x^2 \equiv -1 \pmod{p}$ has a solution, say $x = A$, 
and then $A^2 + 1^2$ is a multiple of p. So we begin with the knowledge that 

$$A^2 + B^2 = Mp$$

for some integers A, B, and M. If $M = 1$, then we’re done, so we suppose that 
$M \ge 2$. 

Fermat’s brilliant idea is to use the numbers A, B, and M to find new integers 
a, b, and m with 

$$a^2 + b^2 = mp \qquad\text{and}\qquad m \le M - 1.$$

Of course, if $m = 1$, then we’re done. And if $m \ge 2$, then we can apply Fermat’s 
Descent Procedure again starting with a, b, and m to find a yet smaller multiple 
of p that is a sum of two squares. Continuing repeatedly in this fashion, we must 
eventually end up with p itself written as a sum of two squares. 

This description has omitted one “minor” detail: how to use the known numbers 
A, B, and M to produce the new numbers a, b, and m. Before describing 
this crucial piece of the proof, we briefly digress to look at a beautiful (and useful) 
identity. 

The identity says that if two numbers that are sums of two squares are multi- 
plied together, then the product is also a sum of two squares. 


$$(u^2 + v^2)(A^2 + B^2) = (uA + vB)^2 + (vA - uB)^2.$$


There is no difficulty in verifying that this identity is correct once it has been written 
down. (Discovering it in the first place is another matter, which we discuss at the 
end of this chapter.) Thus, multiplying out the right-hand side, we find that 

$$\begin{aligned}
(uA + vB)^2 + (vA - uB)^2
  &= (u^2A^2 + 2uAvB + v^2B^2) + (v^2A^2 - 2vAuB + u^2B^2) \\
  &= u^2A^2 + v^2B^2 + v^2A^2 + u^2B^2 \\
  &= (u^2 + v^2)(A^2 + B^2).
\end{aligned}$$
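If you would like to see the identity in action on actual numbers, here is a small sketch. The helper name combine is our own; it simply evaluates the two expressions on the right-hand side of the identity.

```python
def combine(uv, AB):
    """Given (u, v) and (A, B), return (uA + vB, vA - uB).

    By the identity, the sum of the squares of the returned pair
    equals (u^2 + v^2) * (A^2 + B^2).
    """
    u, v = uv
    A, B = AB
    return (u * A + v * B, v * A - u * B)

x, y = combine((2, 1), (3, 2))      # 5 = 2^2 + 1^2 and 13 = 3^2 + 2^2
print(x, y, x * x + y * y)          # 8 -1 65
```

Notice that one of the two numbers may come out negative; this is harmless, since only its square matters.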
We are now ready to describe Fermat’s Descent Procedure for writing any 
prime 

$$p \equiv 1 \pmod{4}$$

as a sum of two squares. As explained above, the idea is to begin with some 
multiple $Mp$ that is a sum of two squares and, by some clever manipulations, find 
a smaller multiple that is also a sum of two squares. To help you understand the 
various steps, we do the example 

$$a^2 + b^2 = 881$$

side by side with the general procedure. The Descent Procedure, in all its glory, is 
on display in the table below. Be sure to go over the procedure step by step 
before proceeding with the text. 

Fermat’s Descent Procedure (the example p = 881 alongside the general case of a prime $p \equiv 1 \pmod{4}$) 

1. Write 
      General:   $A^2 + B^2 = Mp$ with $M < p$. 
      p = 881:   $387^2 + 1^2 = 170 \cdot 881$ with $170 < 881$. 

2. Choose numbers u and v with 
      General:   $u \equiv A \pmod{M}$, $v \equiv B \pmod{M}$, $-\frac{1}{2}M \le u, v \le \frac{1}{2}M$. 
      p = 881:   $47 \equiv 387 \pmod{170}$, $1 \equiv 1 \pmod{170}$, $-\frac{170}{2} \le 47, 1 \le \frac{170}{2}$. 

3. Observe that 
      General:   $u^2 + v^2 \equiv A^2 + B^2 \equiv 0 \pmod{M}$. 
      p = 881:   $47^2 + 1^2 \equiv 387^2 + 1^2 \equiv 0 \pmod{170}$. 

4. So we can write 
      General:   $u^2 + v^2 = Mr$ and $A^2 + B^2 = Mp$ (for some $1 \le r < M$). 
      p = 881:   $47^2 + 1^2 = 170 \cdot 13$ and $387^2 + 1^2 = 170 \cdot 881$. 

5. Multiply to get 
      General:   $(u^2 + v^2)(A^2 + B^2) = M^2 rp$. 
      p = 881:   $(47^2 + 1^2)(387^2 + 1^2) = 170^2 \cdot 13 \cdot 881$. 

6. Use the identity $(u^2 + v^2)(A^2 + B^2) = (uA + vB)^2 + (vA - uB)^2$. 
      General:   $(uA + vB)^2 + (vA - uB)^2 = M^2 rp$, with each of $uA + vB$ and $vA - uB$ divisible by M. 
      p = 881:   $(47 \cdot 387 + 1 \cdot 1)^2 + (1 \cdot 387 - 47 \cdot 1)^2 = 170^2 \cdot 13 \cdot 881$, that is, 
                 $18190^2 + 340^2 = 170^2 \cdot 13 \cdot 881$, with each of 18190 and 340 divisible by 170. 

7. Divide by $M^2$. 
      General:   $\left(\frac{uA + vB}{M}\right)^2 + \left(\frac{vA - uB}{M}\right)^2 = rp$. 
      p = 881:   $\left(\frac{18190}{170}\right)^2 + \left(\frac{340}{170}\right)^2 = 13 \cdot 881$, that is, $107^2 + 2^2 = 13 \cdot 881$. 

   This gives a smaller multiple of p written as a sum of two squares. 

8. Repeat the process until p itself is written as a sum of two squares. 


The Descent Procedure described above reduced the initial equation 

$$387^2 + 1^2 = 170 \cdot 881$$

to the smaller multiple 

$$107^2 + 2^2 = 13 \cdot 881$$

of 881. To complete the task of writing 881 as a sum of two squares, we repeat the 
Descent Procedure starting with the equation $107^2 + 2^2 = 13 \cdot 881$. This gives 


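1. Start from 
      General:   $A^2 + B^2 = Mp$. 
      p = 881:   $107^2 + 2^2 = 13 \cdot 881$. 

2. Choose u and v with 
      General:   $u \equiv A \pmod{M}$, $v \equiv B \pmod{M}$. 
      p = 881:   $3 \equiv 107 \pmod{13}$, $2 \equiv 2 \pmod{13}$. 

3. Write 
      General:   $u^2 + v^2 = Mr$. 
      p = 881:   $3^2 + 2^2 = 13 \cdot 1$. 

4. Multiply to get 
      General:   $(u^2 + v^2)(A^2 + B^2) = M^2 rp$. 
      p = 881:   $(3^2 + 2^2)(107^2 + 2^2) = 13^2 \cdot 1 \cdot 881$. 

5. Use the identity $(u^2 + v^2)(A^2 + B^2) = (uA + vB)^2 + (vA - uB)^2$. 
      General:   $(uA + vB)^2 + (vA - uB)^2 = M^2 rp$. 
      p = 881:   $(3 \cdot 107 + 2 \cdot 2)^2 + (2 \cdot 107 - 3 \cdot 2)^2 = 13^2 \cdot 881$, that is, $325^2 + 208^2 = 13^2 \cdot 881$. 

6. Divide by $M^2$. 
      General:   $\left(\frac{uA + vB}{M}\right)^2 + \left(\frac{vA - uB}{M}\right)^2 = rp$. 
      p = 881:   $25^2 + 16^2 = 881$. 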


This second application of the Descent Procedure has given us the solution to 
our original problem, 

$$881 = 25^2 + 16^2.$$

Of course, for a small number such as 881 it might have been easier to solve 
$881 = a^2 + b^2$ by trial and error, but as soon as p becomes large, the Descent Procedure 
is definitely more efficient. In fact, each time the Descent Procedure is applied, the 
multiple of p is at least cut in half. 
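For readers who like to experiment, here is one way the Descent Procedure might be sketched in Python. The function names are ours, and the sketch assumes that p is a prime, that $A^2 + B^2$ is a multiple of p, and that the multiplier M is smaller than p, exactly as in the table. It is an illustration under those assumptions rather than a polished program.

```python
def centered_residue(n, M):
    """Return the u with u congruent to n modulo M and -M/2 < u <= M/2."""
    u = n % M
    if 2 * u > M:
        u -= M
    return u

def descent_step(A, B, p):
    """From A^2 + B^2 = M*p, return (a, b) with a^2 + b^2 = r*p and r <= M/2."""
    M, remainder = divmod(A * A + B * B, p)
    assert remainder == 0 and M >= 1, "A^2 + B^2 must be a positive multiple of p"
    u = centered_residue(A, M)
    v = centered_residue(B, M)
    # Both numerators are divisible by M, so these divisions are exact.
    a = (u * A + v * B) // M
    b = (v * A - u * B) // M
    return a, b

def write_as_two_squares(p, A, B):
    """Repeat the descent until A^2 + B^2 equals p itself."""
    while A * A + B * B != p:
        A, B = descent_step(A, B, p)
    return abs(A), abs(B)

print(descent_step(387, 1, 881))            # (107, 2)
print(write_as_two_squares(881, 387, 1))    # (25, 16)
```

The two printed lines reproduce the two applications of the procedure carried out above for p = 881.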

To show that the Descent Procedure actually works, we need to verify five 
assertions. At the first step we need to find numbers A and B with 

$$\text{(i)} \quad A^2 + B^2 = Mp \qquad\text{and}\qquad M < p.$$

To do this, we take a solution $x$ to the congruence 

$$x^2 \equiv -1 \pmod{p}$$

with $1 \le x < p$. Quadratic Reciprocity tells us that there is a solution,² since we 
are assuming that $p \equiv 1 \pmod{4}$. Then setting $A = x$ and $B = 1$, we see that 
$A^2 + B^2$ is divisible by p. Furthermore, 

$$M = \frac{A^2 + B^2}{p} \le \frac{(p-1)^2 + 1^2}{p} = p - \frac{2p - 2}{p} < p.$$
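The starting value x can also be found by machine. The sketch below follows the recipe in the footnote to this paragraph: pick a at random, raise it to the power (p − 1)/4 modulo p, and test whether the result squares to −1. The function name is ours, and the sketch assumes that p really is a prime congruent to 1 modulo 4 (for other inputs the loop need not terminate).

```python
import random

def sqrt_of_minus_one(p):
    """Return some x with x*x congruent to -1 modulo p, for a prime p = 1 (mod 4)."""
    assert p % 4 == 1
    while True:
        a = random.randrange(2, p)
        b = pow(a, (p - 1) // 4, p)      # successive squaring, built into Python
        if (b * b + 1) % p == 0:         # succeeds for about half of all choices of a
            return b

x = sqrt_of_minus_one(881)
print(x, (x * x + 1) % 881)   # the second number printed is always 0
```

The pair (A, B) = (x, 1) can then be used to start the descent.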


² In practice, an easy way to solve $x^2 \equiv -1 \pmod{p}$ is to compute $b \equiv a^{(p-1)/4} \pmod{p}$ for 
some randomly chosen values of a. Euler’s formula (Chapter 21) tells us that $b^2 \equiv \left(\frac{a}{p}\right) \pmod{p}$, so 
each choice of a gives us a 50% chance of winning. 

In the second step of the Descent Procedure we chose numbers u and v satisfying 

$$u \equiv A \pmod{M}, \qquad v \equiv B \pmod{M}, \qquad\text{and}\qquad -\tfrac{1}{2}M \le u, v \le \tfrac{1}{2}M.$$

We then observed that 

$$u^2 + v^2 \equiv A^2 + B^2 \equiv 0 \pmod{M},$$

so $u^2 + v^2$ is divisible by M, say $u^2 + v^2 = Mr$. The remaining four statements 
we need to check are as follows: 

(ii) $r \ge 1$. 

(iii) $r < M$. 

(iv) $uA + vB$ is divisible by M. 

(v) $vA - uB$ is divisible by M. 

We check them in reverse order. To verify (v) we compute 

$$vA - uB \equiv B \cdot A - A \cdot B \equiv 0 \pmod{M}.$$

Similarly, for (iv) we have 

$$uA + vB \equiv A \cdot A + B \cdot B \equiv Mp \equiv 0 \pmod{M}.$$

For (iii) we use the fact that u and v are between $-M/2$ and $M/2$ to estimate 

$$r = \frac{u^2 + v^2}{M} \le \frac{(M/2)^2 + (M/2)^2}{M} = \frac{M}{2} < M.$$

Notice that this actually shows that $r \le M/2$, so every time the Descent Procedure 
is used, the multiple of p is at least cut in half. 

Finally, to show that (ii) is true, we need to check that $r \ne 0$. So we assume 
that $r = 0$ and see what happens. Well, if $r = 0$, then $u^2 + v^2 = 0$, so we must have 
$u = v = 0$. But $u \equiv A \pmod{M}$ and $v \equiv B \pmod{M}$, so A and B are divisible 
by M. This implies that $A^2 + B^2$ is divisible by $M^2$. But $A^2 + B^2 = Mp$, so we 
see that M must divide the prime p. We also know that $M < p$, so it must be true 
that $M = 1$. This means that $A^2 + B^2 = p$ and we’re already done writing p as a 
sum of two squares! Thus, either (ii) is true, or else we already had $A^2 + B^2 = p$ 
and there was no reason to use the Descent Procedure in the first place. 

This completes the proof that the Descent Procedure always works, so we have 
now finished proving both parts of the Sum of Two Squares Theorem for Primes. 


Digression on Sums of Squares and Complex Numbers 

The identity 

$$(u^2 + v^2)(A^2 + B^2) = (uA + vB)^2 + (vA - uB)^2, \qquad (*)$$


which expresses the product of sums of two squares as a sum of two squares, has 
been very useful, and we will find further uses for it in the next chapter. You may 
have wondered from whence this identity comes. The answer lies in the realm of 
complex numbers, that is, numbers of the form 


$$z = x + iy,$$


where $i$ is a square root of $-1$. Two complex numbers can be multiplied together 
in the usual way as long as you remember to replace $i^2$ by $-1$. Thus, 


$$\begin{aligned}
(x_1 + iy_1)(x_2 + iy_2) &= x_1x_2 + ix_1y_2 + iy_1x_2 + i^2y_1y_2 \\
  &= (x_1x_2 - y_1y_2) + i(x_1y_2 + y_1x_2).
\end{aligned}$$


Complex numbers also have absolute values, 


$$|z| = |x + iy| = \sqrt{x^2 + y^2}.$$


This formula comes from viewing the number $z = x + iy$ as corresponding to the 
point (x, y) in the plane, and then |z| is just the distance from z to the origin (0, 0). 
The identity (*) now comes from the following fact: 


The absolute value of a product is the product of the absolute values. 


In other words, $|z_1 z_2| = |z_1| \cdot |z_2|$. Writing this out in terms of x’s and y’s gives 

$$\begin{aligned}
|(x_1 + iy_1)(x_2 + iy_2)| &= |x_1 + iy_1| \cdot |x_2 + iy_2| \\
|(x_1x_2 - y_1y_2) + i(x_1y_2 + y_1x_2)| &= |x_1 + iy_1| \cdot |x_2 + iy_2| \\
\sqrt{(x_1x_2 - y_1y_2)^2 + (x_1y_2 + y_1x_2)^2} &= \sqrt{x_1^2 + y_1^2}\,\sqrt{x_2^2 + y_2^2}.
\end{aligned}$$

If we square both sides of this last equation, we get exactly our identity (where 
$x_1 = u$, $y_1 = v$, $x_2 = A$, and $y_2 = -B$). 

There is a similar identity involving sums of four squares that is due to Euler: 

$$\begin{aligned}
(a^2 + b^2 + c^2 + d^2)(A^2 + B^2 + C^2 + D^2)
  &= (aA + bB + cC + dD)^2 + (aB - bA - cD + dC)^2 \\
  &\quad + (aC + bD - cA - dB)^2 + (aD - bC + cB - dA)^2.
\end{aligned}$$

This complicated identity is related to the theory of quaternions³ in the same way 
that our identity is related to complex numbers. It is an unfortunate fact that there is 
no analogous identity for sums of three squares, and indeed the question of writing 
numbers as sums of three squares is much more difficult than the same problem for 
either two or four squares. 
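Returning for a moment to the two squares case, the link with complex numbers is easy to test numerically. In the sketch below (the helper names are ours), a complex number $x + iy$ with integer x and y is stored as the pair (x, y), multiplication uses the rule $i^2 = -1$, and we make the substitution $x_1 = u$, $y_1 = v$, $x_2 = A$, $y_2 = -B$ described above.

```python
def multiply(z1, z2):
    """(x1 + i*y1)(x2 + i*y2), with each complex number stored as a pair (x, y)."""
    x1, y1 = z1
    x2, y2 = z2
    return (x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)

def norm(z):
    """The square of the absolute value, x^2 + y^2."""
    x, y = z
    return x * x + y * y

u, v, A, B = 2, 1, 3, 2
product = multiply((u, v), (A, -B))
print(product)                                           # (8, -1)
print(norm(product) == norm((u, v)) * norm((A, -B)))     # True
```

The pair (8, −1) is exactly $(uA + vB, vA - uB)$, so the multiplication rule for complex numbers reproduces the identity.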


Exercises 


24.1. (a) Make a list of all primes $p < 50$ that can be written in the form $p = a^2 + ab + b^2$. 
For example, p = 7 has this form with a = 2 and b = 1, while p = 11 cannot be 
written in this form. Try to find a pattern and make a guess as to exactly which primes 
have this form. (Can you prove that at least part of your guess is correct?) 

(b) Same question for primes p that can be written in the form⁴ $p = a^2 + 2b^2$. 


24.2. If the prime p can be written in the form $p = a^2 + 5b^2$, show that 

$$p \equiv 1 \text{ or } 9 \pmod{20}.$$

(Of course, we are ignoring $5 = 0^2 + 5 \cdot 1^2$.) 

24.3. Use the Descent Procedure twice, starting from the equation 

$$557^2 + 55^2 = 26 \cdot 12049,$$

to write the prime 12049 as a sum of two squares. 


24.4. (a) Start from $259^2 + 1^2 = 34 \cdot 1973$ and use the Descent Procedure to write the 
prime 1973 as a sum of two squares. 
(b) Start from $261^2 + 947^2 = 10 \cdot 96493$ and use the Descent Procedure to write the 
prime 96493 as a sum of two squares. 


24.5. (a) Which primes $p < 100$ can be written as a sum of three squares, 

$$p = a^2 + b^2 + c^2?$$

(We allow one of a, b, c to equal 0, so, for example, $5 = 2^2 + 1^2 + 0^2$ is a sum of 
three squares.) 

(b) Based on the data you collected in (a), try to make a conjecture describing which 
primes can be written as sums of three squares. Your conjecture should consist of the 
following two statements, where you are to fill in the blanks: 

(i) If p satisfies ______________, then p is a sum of three squares. 

(ii) If p satisfies ______________, then p is not a sum of three squares. 

(c) Prove part (ii) of your conjecture in (b). [You might also try to prove part (i), but be 
warned, it is quite difficult.] 

³ Quaternions are numbers of the form $a + ib + jc + kd$, where i, j, and k are three different 
square roots of $-1$ satisfying strange multiplication rules such as $ij = k = -ji$. 

⁴ The question of which primes p can be written in the form $p = a^2 + nb^2$ has been extensively 
studied and has connections with many branches of mathematics. There is even an entire book on 
the subject, Primes of the Form $x^2 + ny^2$, by David Cox (New York: John Wiley & Sons, 1989). 


24.6. (a) Let $c \ge 2$ be an integer such that the congruence $x^2 \equiv -1 \pmod{c}$ has a 
solution. Show that c is a sum of two squares. (Hint. Show that the descent procedure 
described earlier in this chapter still works.) 

(b) Carry out the descent argument for c = 65 starting from the equation 
$14^2 + 57^2 = 53 \cdot 65$ to express 65 as a sum of two squares. (Note 65 is not prime.) 

(c) Is it true that every integer $c \ge 2$ satisfying $c \equiv 1 \pmod{4}$ is a sum of two squares? 
If not, give a counterexample, and explain which step of the descent procedure fails. 


24.7. Write a program that solves $x^2 + y^2 = n$ by trying x = 0, 1, 2, 3, ... and 
checking if $n - x^2$ is a perfect square. Your program should return all solutions with $x \le y$ 
if any exist and should return an appropriate message if there are no solutions. 


24.8. (a) Write a program that solves $x^2 + y^2 = p$ for primes $p \equiv 1 \pmod{4}$ using 
Fermat’s Descent Procedure. The input should consist of the prime p and a pair of 
numbers (A, B) satisfying 

$$A^2 + B^2 \equiv 0 \pmod{p}.$$

(b) In the case that $p \equiv 5 \pmod{8}$, modify your program as follows so that the user 
doesn’t have to input (A, B). First, use successive squaring to compute the number 
$A \equiv -2 \cdot (-4)^{(p-5)/8} \pmod{p}$. Then $A^2 + 1 \equiv 0 \pmod{p}$ (see Exercise 22.8), so 
you can use (A, 1) as your starting value to perform the descent. 


Chapter 25 


Which Numbers Are Sums 
of Two Squares? 


In the last chapter we gave a definitive answer to the question of which primes can 
be written as sums of two squares. We now take up the same question for arbitrary 
numbers. Part of our strategy, which can be summed up in three words, has a long 
and glorious history: 


Divide and Conquer! 


Of course, “Divide” doesn’t mean division per se. Rather, it means to break up 
the problem into pieces of manageable size, and then “Conquer” means we need 
to solve each piece. But these two steps, which may suffice for warfare, have to be 
followed by a third step: fitting the pieces back together. This unification step uses 
the identity from the last chapter that expresses a product of sums of squares as a 
sum of squares: 


$$(u^2 + v^2)(A^2 + B^2) = (uA + vB)^2 + (vA - uB)^2. \qquad (*)$$


Here, then, is our step-by-step strategy for expressing a number m as a sum of two 
squares. 

Divide: Factor m into a product of primes $p_1 p_2 \cdots p_r$. 

Conquer: Write each prime $p_i$ as a sum of two squares. 

Unify: Use the identity (*) repeatedly to write m as a 
sum of two squares. 

We know from the Sum of Two Squares Theorem for Primes (Theorem 24.1) 
exactly when the Conquer step works, since we know that a prime p is a sum of two 
squares if and only if either $p = 2$ or $p \equiv 1 \pmod{4}$. For example, to write 10 as 
a sum of two squares, we factor $10 = 2 \cdot 5$, write 2 and 5 as sums of two squares, 

$$2 = 1^2 + 1^2 \qquad\text{and}\qquad 5 = 2^2 + 1^2,$$

and use the identity to recombine 

$$10 = 2 \cdot 5 = (1^2 + 1^2)(2^2 + 1^2) = (2 + 1)^2 + (2 - 1)^2 = 3^2 + 1^2.$$


Here’s a more complicated example. We’ll write m = 1105 as a sum of two 
squares. 
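Divide: Factor $m = 1105 = 5 \cdot 13 \cdot 17$. 

Conquer: Write each prime p as a sum of two squares. 

$$5 = 2^2 + 1^2, \qquad 13 = 3^2 + 2^2, \qquad 17 = 4^2 + 1^2$$

Unify: Use the identity (*) repeatedly to write m as a 
sum of two squares. 

$$\begin{aligned}
m = 1105 &= 5 \cdot 13 \cdot 17 \\
  &= (2^2 + 1^2)(3^2 + 2^2)(4^2 + 1^2) \\
  &= \bigl((6 + 2)^2 + (3 - 4)^2\bigr)(4^2 + 1^2) \\
  &= (8^2 + 1^2)(4^2 + 1^2) \\
  &= (32 + 1)^2 + (4 - 8)^2 \\
  &= 33^2 + 4^2
\end{aligned}$$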
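The Unify step is a natural candidate for a short program. In the sketch below, the helper combine (a name of our own choosing) applies the identity (*) once, and reduce from Python's standard library applies it repeatedly along the list of primes. The representations of 5, 13, and 17 are simply typed in by hand, as in the Conquer step above.

```python
from functools import reduce

def combine(uv, AB):
    """One use of the identity (*): return (|uA + vB|, |vA - uB|)."""
    u, v = uv
    A, B = AB
    return (abs(u * A + v * B), abs(v * A - u * B))

# Conquer step, done by hand: 5 = 2^2 + 1^2, 13 = 3^2 + 2^2, 17 = 4^2 + 1^2.
pieces = [(2, 1), (3, 2), (4, 1)]

x, y = reduce(combine, pieces)      # Unify step
print(x, y, x * x + y * y)          # 33 4 1105
```

Taking the pieces in a different order, or swapping the two numbers inside a piece, can lead to a different way of writing 1105 as a sum of two squares.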

Our Divide, Conquer, and Unify strategy is successful for the number m provided 
that each prime factor of m is itself a sum of two squares. We know which 
primes can be written as sums of two squares, so we now have a method for writing 
m as a sum of two squares if m factors as 

$$m = p_1^{k_1} p_2^{k_2} p_3^{k_3} \cdots p_r^{k_r},$$

where every prime in the factorization is either 2 or is congruent to 1 modulo 4. 

However, if you look back at the list in the last chapter, you’ll see that there are 
other m’s that are sums of two squares. For example, 

$$9 = 3^2 + 0^2, \qquad 18 = 3^2 + 3^2, \qquad\text{and}\qquad 45 = 6^2 + 3^2.$$


What’s going on? Notice that in each case m is divisible by $3^2$ and $m = a^2 + b^2$ 
with both a and b divisible by 3. If we divide these three examples by $3^2$, we get 

$$\frac{9}{3^2} = \frac{3^2 + 0^2}{3^2} = 1^2 + 0^2, \qquad
\frac{18}{3^2} = \frac{3^2 + 3^2}{3^2} = 1^2 + 1^2, \qquad
\frac{45}{3^2} = \frac{6^2 + 3^2}{3^2} = 2^2 + 1^2.$$

In other words, these three examples were created by taking the equations 

$$1 = 1^2 + 0^2, \qquad 2 = 1^2 + 1^2, \qquad\text{and}\qquad 5 = 2^2 + 1^2$$

and multiplying both sides by $3^2$. 

We can do this in general. Given any $m = a^2 + b^2$, we can multiply by $d^2$ to 
get 

$$d^2 m = (da)^2 + (db)^2.$$

Thus, if m is a sum of two squares, then so is $d^2 m$ for any d. On the other hand, 
if $m = a^2 + b^2$ is a sum of two squares and if a and b have a common factor, say 
$a = dA$ and $b = dB$, then we can factor out $d^2$ to get 

$$m = d^2(A^2 + B^2).$$

Thus, m is divisible by $d^2$, and $m/d^2$ is a sum of two squares. 
The moral is that squares dividing m don’t count when we’re trying to write m 
as a sum of two squares. In other words, take m and factor it as 

$$m = p_1 p_2 \cdots p_r M^2,$$

where the prime factors $p_1, p_2, \ldots, p_r$ are all different. Then m can be written as a 
sum of two squares provided that each of $p_1, p_2, \ldots, p_r$ can be written as a sum of 
two squares. For example, consider m = 252000. We factor m as 

$$m = 252000 = 2^5 \cdot 3^2 \cdot 5^3 \cdot 7 = 2 \cdot 5 \cdot 7 \cdot (2^2 \cdot 3 \cdot 5)^2 = 2 \cdot 5 \cdot 7 \cdot 60^2.$$

The prime 7 is not a sum of two squares, so m is not a sum of two squares. (See 
Exercise 25.4.) 

As another example, take m = 25798500. Then 

$$m = 25798500 = 2^2 \cdot 3^4 \cdot 5^3 \cdot 7^2 \cdot 13 = 5 \cdot 13 \cdot (2 \cdot 3^2 \cdot 5 \cdot 7)^2 = 5 \cdot 13 \cdot 630^2.$$

In this case, 5 and 13 are sums of squares, and we easily find that 
$65 = 5 \cdot 13 = 8^2 + 1^2$. Multiplying both sides by $630^2$ gives 

$$m = 65 \cdot 630^2 = (8 \cdot 630)^2 + (1 \cdot 630)^2 = 5040^2 + 630^2.$$
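The factorization $m = p_1 p_2 \cdots p_r M^2$ used in these two examples can be computed by simple trial division. The following sketch is only meant for numbers of modest size, and the function names are ours. It returns the distinct primes $p_1, \ldots, p_r$ together with M, and then applies the criterion that each $p_i$ must be 2 or congruent to 1 modulo 4.

```python
def squarefree_part(m):
    """Return (primes, M) with m = (product of the distinct primes) * M**2."""
    primes, M, d = [], 1, 2
    while d * d <= m:
        exponent = 0
        while m % d == 0:            # strip out every factor of d
            m //= d
            exponent += 1
        M *= d ** (exponent // 2)    # pairs of d go into the square part
        if exponent % 2 == 1:        # one copy of d is left over
            primes.append(d)
        d += 1
    if m > 1:                        # whatever remains is a prime
        primes.append(m)
    return primes, M

def is_sum_of_two_squares(m):
    primes, _ = squarefree_part(m)
    return all(p == 2 or p % 4 == 1 for p in primes)

print(squarefree_part(252000), is_sum_of_two_squares(252000))       # ([2, 5, 7], 60) False
print(squarefree_part(25798500), is_sum_of_two_squares(25798500))   # ([5, 13], 630) True
```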


In this chapter we have given a definitive answer to the question of which num- 
bers are sums of two squares. We summarize our result in the following theorem, 
which also includes further interesting facts whose proof we leave as exercises; see 
Exercises 25.4 and 25.5. 

Theorem 25.1 (Sum of Two Squares Theorem). Let m be a positive integer. 
(a) Factor m as 

$$m = p_1 p_2 \cdots p_r M^2$$

with distinct prime factors $p_1, p_2, \ldots, p_r$. Then m can be written as a sum of 
two squares exactly when every $p_i$ is either 2 or is congruent to 1 modulo 4. 
(b) The number m can be written as a sum of two squares $m = a^2 + b^2$ with 
gcd(a, b) = 1 if and only if it satisfies one of the following two conditions: 

(i) m is odd and every prime divisor of m is congruent to 1 modulo 4. 

(ii) m is even, m/2 is odd, and every prime divisor of m/2 is congruent to 1 
modulo 4. 


The Return of the Pythagorean Triples 


Recall that¹ a Pythagorean triple is a triple of positive integers (a, b, c) satisfying 
the equation 

$$a^2 + b^2 = c^2,$$

and the triple is called primitive if gcd(a, b) = 1. We are now in a position to 
completely describe all numbers that can appear as the hypotenuse c in a primitive 
Pythagorean triple. 

The Pythagorean Triples Theorem says that every primitive Pythagorean triple 
can be obtained by choosing relatively prime odd integers $s > t \ge 1$ and setting 

$$a = st, \qquad b = \frac{s^2 - t^2}{2}, \qquad c = \frac{s^2 + t^2}{2}.$$

So we are asking for a description of all numbers c for which we can find an s and 
a t, such that $c = (s^2 + t^2)/2$. In other words, c is the hypotenuse of a primitive 
Pythagorean triple exactly when the equation 

$$2c = s^2 + t^2$$

has a solution in relatively prime odd integers s and t. 

Note first that c must be odd. (We checked this in Chapter 2.) So we are 
asking which numbers 2c with c odd can be written as sums of the squares of two 
relatively prime integers. The Sum of Two Squares Theorem says that this can 
be done if and only if every prime dividing c is congruent to 1 modulo 4. The 
following proposition records what we have proved. 


¹ In this context, the phrase “Recall that...” is a polite way of saying “Now might be a good time 
to reread Chapter 2 and review the Pythagorean Triples Theorem in that chapter.” 

Theorem 25.2 (Pythagorean Hypotenuse Proposition). A number c appears as the 
hypotenuse of a primitive Pythagorean triple (a, b, c) if and only if c is a product 
of primes each of which is congruent to 1 modulo 4. 


For example, the number c = 1479 cannot be the hypotenuse of a primitive 
Pythagorean triple, since $1479 = 3 \cdot 17 \cdot 29$. On the other hand, c = 1105 can be 
a hypotenuse, since $1105 = 5 \cdot 13 \cdot 17$. Furthermore, we can solve $s^2 + t^2 = 2c$ 
to find the values of s and t and then use these to find the corresponding a and b. 
Thus, $1105 = 33^2 + 4^2$ from earlier in this chapter, and then 

$$2c = 2 \cdot 1105 = (1^2 + 1^2)(33^2 + 4^2) = 37^2 + 29^2.$$

Now s = 37 and t = 29, so $a = st = 1073$ and $b = (s^2 - t^2)/2 = 264$. This gives 
the desired primitive Pythagorean triple (1073, 264, 1105) with hypotenuse 1105. 
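For small values of c, the search for s and t can be done by brute force. Here is a sketch; the function name is ours, and it uses trial and error rather than the factorization method of this chapter. It tries each odd t, checks whether $2c - t^2$ is the square of an odd number s that is larger than t and relatively prime to t, and then builds the triple from the formulas $a = st$ and $b = (s^2 - t^2)/2$.

```python
from math import gcd, isqrt

def primitive_triples_with_hypotenuse(c):
    """All (a, b, c) coming from 2c = s^2 + t^2 with s > t >= 1 odd and coprime."""
    triples = []
    for t in range(1, isqrt(c) + 1, 2):      # t < s forces t*t < c
        s_squared = 2 * c - t * t
        s = isqrt(s_squared)
        if s * s == s_squared and s > t and s % 2 == 1 and gcd(s, t) == 1:
            triples.append((s * t, (s * s - t * t) // 2, c))
    return triples

print(primitive_triples_with_hypotenuse(1105))
print(primitive_triples_with_hypotenuse(1479))   # []
```

The list printed for c = 1105 includes the triple (1073, 264, 1105) found above, along with others coming from different choices of s and t.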


Exercises 


25.1. For each of the following numbers m, either write m as a sum of two squares or 
explain why it is not possible to do so. 
(a) 4370 (b) 1885 (c) 1189 (d) 3185 


25.2. For each of the following numbers c, either find a primitive Pythagorean triple with 
hypotenuse c or explain why it is not possible to do so. 
(a) 4370 (b) 1885 (c) 1189 (d) 3185 


25.3. Find two pairs of relatively prime positive integers (a, c) such that $a^2 + 5929 = c^2$. 
Can you find additional pairs with gcd(a, c) > 1? 


25.4. In this exercise you will complete the proof of the first part of the Sum of Two 
Squares Theorem (Theorem 25.1). Let m be a positive integer and factor m as 


$$m = p_1 p_2 \cdots p_r M^2$$


with distinct prime factors $p_1, p_2, \ldots, p_r$. If some $p_i$ is congruent to 3 modulo 4, prove 
that m cannot be written as a sum of two squares. 


25.5. In this exercise you will prove the second part of the Sum of Two Squares Theorem 
(Theorem 25.1). Let m be a positive integer. 

(a) If m is odd and if every prime dividing m is congruent to 1 modulo 4, prove that m 
can be written as a sum of two squares $m = a^2 + b^2$ with gcd(a, b) = 1. 

(b) If m is even and m/2 is odd and if every prime dividing m/2 is congruent to 1 
modulo 4, prove that m can be written as a sum of two squares $m = a^2 + b^2$ with 
gcd(a, b) = 1. 

(c) If m can be written as a sum of two squares $m = a^2 + b^2$ with gcd(a, b) = 1, prove 
that m is one of the numbers described in (a) or (b). 

25.6. For any positive integer m, let 

$$S(m) = \bigl(\text{\# of ways to write } m = a^2 + b^2 \text{ with } a \ge b \ge 0\bigr).$$

For example, 

$$S(5) = 1, \quad\text{since } 5 = 2^2 + 1^2,$$

$$S(65) = 2, \quad\text{since } 65 = 8^2 + 1^2 = 7^2 + 4^2,$$

while S(15) = 0. 
(a) Compute the following values: 

(i) S(10), (ii) S(70), (iii) S(130), (iv) S(1105). 

(b) If p is a prime and $p \equiv 1 \pmod{4}$, what is the value of S(p)? Prove that your answer 
is correct. 

(c) Let p and q be two different primes, both congruent to 1 modulo 4. What is the value 
of S(pq)? Prove that your answer is correct. 

(d) More generally, if $p_1, \ldots, p_r$ are distinct primes, all congruent to 1 modulo 4, what 
is the value of $S(p_1 p_2 \cdots p_r)$? Prove that your answer is correct. 


25.7. Write a program that solves $x^2 + y^2 = n$ by factoring n into a product of primes, 
solving each $u^2 + v^2 = p$ using descent (Exercise 24.8), and then combining the solutions 
to find (x, y). 


Chapter 26 


As Easy as One, Two, Three 


Many number theoretic assertions have the form: 


Such and such a statement is true for every natural number n. 


Here are some interesting examples.¹ 


• $1^2 + 2^2 + \cdots + n^2 = \dfrac{2n^3 + 3n^2 + n}{6}$ for every $n \in \mathbb{N}$. 


• Every natural number n is equal to a product of prime numbers. 


• Every natural number n is equal to a sum of at most four squares. 


It is easy to check that these statements are true for any particular value of n. 
For example, they are true for n = 12, since 

$$1^2 + 2^2 + \cdots + 12^2 = 650 = \frac{2 \cdot 12^3 + 3 \cdot 12^2 + 12}{6},$$

$$12 = 2 \cdot 2 \cdot 3,$$

$$12 = 1^2 + 1^2 + 1^2 + 3^2.$$

But even if we check that they are true for a lot of values of n, say for all $n \le 1000$, 
that won’t prove that they are true for all values of n. 

Of course, you might say that verifying a statement for all $n \le 1000$ provides 
convincing evidence that the statement is true. But no finite number of cases constitutes 
an incontrovertible proof, and history shows that even copious quantities of 
evidence can be misleading. (We provide some cautionary tales at the end of this 
chapter.) 

¹ We will prove the first statement in this chapter, and the second statement was proved in Chapter 7. 
The third statement is called Lagrange’s Four Squares Theorem. We will not prove Lagrange’s 
theorem, but in Chapter 25 we proved a related Two Squares Theorem due to Fermat. 

Suppose that we want to prove that the formula 

$$1^2 + 2^2 + \cdots + n^2 = \frac{2n^3 + 3n^2 + n}{6} \qquad (*)$$

is true for every $n \in \mathbb{N}$. We might start by checking that it is true for the first few 
values of n, 

$$1^2 = 1 = \frac{2 \cdot 1^3 + 3 \cdot 1^2 + 1}{6},$$

$$1^2 + 2^2 = 5 = \frac{2 \cdot 2^3 + 3 \cdot 2^2 + 2}{6},$$

$$1^2 + 2^2 + 3^2 = 14 = \frac{2 \cdot 3^3 + 3 \cdot 3^2 + 3}{6}.$$

Now suppose that we have verified that formula (*) is true for all values of n up
to n = 99, and we want to check it for n = 100. If we start from scratch, then it’s
a lot of work to compute the left-hand side, $1^2 + 2^2 + \cdots + 100^2$. But if we are
clever, we will use the fact that (*) is true when n = 99 to simplify the calculation.
Thus we are assuming that we have already proven the formula

\[ 1^2 + 2^2 + \cdots + 99^2 = \frac{2 \cdot 99^3 + 3 \cdot 99^2 + 99}{6} = 328350, \]

so we can use this formula to simplify the computation of the left-hand side of (*)
for n = 100,

\[ 1^2 + 2^2 + \cdots + 100^2 = (1^2 + 2^2 + \cdots + 99^2) + 100^2 = 328350 + 10000 = 338350. \]

We then check that we get the same value using n = 100 in the right-hand side
of (*),

\[ \frac{2 \cdot 100^3 + 3 \cdot 100^2 + 100}{6} = 338350. \]

Now that we’ve handled the case n = 100, we can use it to do the case n = 101,
then we can use the case n = 101 to do n = 102, and so on. That, in a nutshell,²
is the idea underlying the Principle of Mathematical Induction.
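
The bootstrapping idea above, where the case n = 100 is obtained from the case n = 99 by adding a single term, can be imitated numerically. The following is only an illustrative sketch; the function names are our own and do not come from the text. It checks formula (*) for many values of n by carrying a running sum forward, which of course is evidence and not a proof.

```python
def sum_of_squares_formula(n: int) -> int:
    """Right-hand side of (*): (2n^3 + 3n^2 + n) / 6, which is always an integer."""
    return (2 * n**3 + 3 * n**2 + n) // 6


def check_formula_up_to(limit: int) -> bool:
    """Check (*) for n = 1, ..., limit, reusing the sum already computed for n - 1."""
    running_sum = 0
    for n in range(1, limit + 1):
        running_sum += n * n  # left-hand side for n, built from the case n - 1
        if running_sum != sum_of_squares_formula(n):
            return False
    return True


print(sum_of_squares_formula(99))   # 328350
print(sum_of_squares_formula(100))  # 338350
print(check_formula_up_to(1000))    # True
```

Each pass through the loop does exactly what the text does when it goes from n = 99 to n = 100: it adds one new square to a sum that is already known.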


Step I (Initialization) Check the initial case n = 1. 


Step II (Induction Step) Assume that we’ve already completed the proof for all
values up to n, and using this assumption, which is called the induction
hypothesis, prove the statement for n + 1.


² I could be bounded in a nutshell, and count myself a king of infinite space, were it not that I have
bad dreams. Hamlet (II.ii)


If we can do both of these steps, then the mathematical statement that we are trying 
to prove is true for all values of n. Do you see why? Well, it’s true for n = 1, 
because that was Step I. But since it’s true for n = 1, Step II tells us that it’s true 
for n = 2. But then Step II tells us that it’s true for n = 3. And so on....

To see how induction works in practice, we prove the sum-of-squares
formula (*) stated earlier. The statement that we are trying to prove is the formula

\[ S(n):\quad 1^2 + 2^2 + \cdots + n^2 = \frac{2n^3 + 3n^2 + n}{6}. \]

We start by establishing the initial case n = 1; that is, we verify that statement S(1)
is true:

\[ 1^2 = 1 \quad\text{and}\quad \frac{2 \cdot 1^3 + 3 \cdot 1^2 + 1}{6} = 1. \]

Next we assume that we have already proven statement S(n), and we want to verify 
that statement S(n + 1) is also true. Here’s the argument. 
First we compute the left-hand side of the formula of Statement S(n + 1). The sum
$1^2 + 2^2 + \cdots + n^2$ is equal to $(2n^3 + 3n^2 + n)/6$ because our induction hypothesis
is that statement S(n) is true, so

\[ 1^2 + 2^2 + \cdots + n^2 + (n+1)^2 = \frac{2n^3 + 3n^2 + n}{6} + (n+1)^2 = \frac{2n^3 + 9n^2 + 13n + 6}{6}. \]

Second we compute the right-hand side of the formula for Statement S(n + 1),

\[ \frac{2(n+1)^3 + 3(n+1)^2 + (n+1)}{6} = \frac{2n^3 + 9n^2 + 13n + 6}{6}. \]

Comparing the results of the two computations, we have shown that

\[ 1^2 + 2^2 + \cdots + n^2 + (n+1)^2 = \frac{2(n+1)^3 + 3(n+1)^2 + (n+1)}{6}, \]
which proves that statement S(n + 1) is true. We have now proven:
• Statement S(1) is true.
• If Statement S(n) is true, then Statement S(n + 1) is also true.
This completes the proof by induction that Statement S(n) is true for every natural
number n.

Another Version of Induction. There is another version of induction that goes
by the name complete induction or strong induction. In this version, we assume
that we have proven the statement for all values of n up to and including some
value N, and using this assumption, we prove that the statement is true when n is
equal to N + 1. So complete induction requires the following two steps:


Step I (Initialization) Check the initial case n = 1.³


Step II (Induction Step) Assume that we’ve already completed the proof for all
values of n satisfying 1 ≤ n ≤ N, and using this assumption, prove that the
statement is true for n = N + 1.


We already used complete induction in Chapter 7 when we proved that every
integer n ≥ 2 is a product of primes. We briefly recall the proof in order to illustrate
more formally how complete induction works.

The statement to be proven is:


P(n) : n is a product of primes.


We want to prove that P(n) is true for all n ≥ 2, so in this case our initialization
step is to prove that P(2) is true. But P(2) is obviously true, since 2 is itself prime.

We now make the inductive hypothesis that P(n) is true for all 2 ≤ n ≤ N,
and we want to prove that P(N + 1) is true. If N + 1 is prime, then we are done.
Otherwise N + 1 factors as N + 1 = ab with a and b between 2 and N. The
inductive hypothesis tells us that P(a) and P(b) are true, so a and b are products
of primes. Hence N + 1 = ab is also a product of primes, which completes our
proof by induction that P(n) is true for all n ≥ 2.
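
The complete induction argument translates almost word for word into a recursive procedure. The sketch below is our own illustration (the name `prime_factors` is hypothetical, not from the text): if n has no factorization n = ab with 2 ≤ a, b < n, then n is prime; otherwise the procedure is applied to the two smaller numbers a and b, just as the inductive hypothesis is applied to P(a) and P(b).

```python
from math import isqrt


def prime_factors(n: int) -> list[int]:
    """Return a list of primes whose product is n, for any integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    for a in range(2, isqrt(n) + 1):
        if n % a == 0:
            b = n // a
            # n = a * b with both factors smaller than n, so (as in the
            # induction step) we may assume each factor can be handled.
            return prime_factors(a) + prime_factors(b)
    return [n]  # no proper factorization exists, so n itself is prime


print(prime_factors(12))   # [2, 2, 3]
print(prime_factors(315))  # [3, 3, 5, 7]
```

The recursion terminates for the same reason the induction works: every recursive call is made on a strictly smaller integer that is still at least 2.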


Why Experiments Aren’t the Same as Proofs. As promised earlier, we give
some examples of statements that are false, despite a significant amount of
numerical evidence that they are true. Recall from Chapter 13 that the prime number
counting function $\pi(x)$ counts the number of primes less than or equal to x. We
have proven that not only are there infinitely many primes, but there are infinitely
many that are congruent to 1 modulo 4, and there are infinitely many that are
congruent to 3 modulo 4. (The latter is Theorem 12.2 and the former is Theorem 21.3.)

In the 1850s Chebychev noted that primes congruent to 3 modulo 4 seem to be
more common than primes congruent to 1 modulo 4. To study this phenomenon,
we split the prime counting function into two pieces,

\[ \pi_1(x) = \#\{\text{primes } p \text{ with } p \le x \text{ and } p \equiv 1 \pmod{4}\}, \]
\[ \pi_3(x) = \#\{\text{primes } p \text{ with } p \le x \text{ and } p \equiv 3 \pmod{4}\}. \]


³ Some authors omit the initialization step and instead begin the induction step with N = 0. Note
that the induction hypothesis for N = 0 is vacuous, so doing the induction step for N = 0 means
directly proving the statement for n = 1, which is the same as doing the initialization step.


Then with a certain amount of work, one can check that

\[ \pi_3(x) > \pi_1(x) \quad\text{for all } x < 10000. \]

This provides moderately convincing evidence that $\pi_3(x)$ is always greater than
$\pi_1(x)$, but in 1957 Leech showed that⁴

\[ \pi_1(26861) = 1473 \quad\text{and}\quad \pi_3(26861) = 1472. \]
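
The race between the two classes of primes modulo 4 is small enough to replay on a computer. The sketch below is our own illustration (the helper names are hypothetical): it sieves the primes up to a bound and reports the first prime at which the count of primes congruent to 1 modulo 4 moves ahead of the count of primes congruent to 3 modulo 4.

```python
from math import isqrt


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p with p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            # cross out the multiples of i, starting at i * i
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(limit + 1) if sieve[i]]


def first_lead_of_1_mod_4(limit: int):
    """First prime x <= limit with pi_1(x) > pi_3(x), as (x, pi_1, pi_3), or None."""
    pi1 = pi3 = 0
    for p in primes_up_to(limit):
        if p % 4 == 1:
            pi1 += 1
        elif p % 4 == 3:
            pi3 += 1
        if pi1 > pi3:
            return p, pi1, pi3
    return None


print(first_lead_of_1_mod_4(10000))  # None
print(first_lead_of_1_mod_4(30000))  # (26861, 1473, 1472)
```

A search bounded by 10000 finds nothing, which is exactly the kind of evidence that looks convincing until the bound is raised.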


The situation is even more striking if we instead look at primes modulo 3. Thus
if we let

\[ \pi_{1 \bmod 3}(x) = \#\{\text{primes } p \text{ with } p \le x \text{ and } p \equiv 1 \pmod{3}\}, \]
\[ \pi_{2 \bmod 3}(x) = \#\{\text{primes } p \text{ with } p \le x \text{ and } p \equiv 2 \pmod{3}\}, \]

then experiments show that $\pi_{2 \bmod 3}(x)$ is larger than $\pi_{1 \bmod 3}(x)$ for all x up to
600 billion! Indeed,

\[ \pi_{2 \bmod 3}(x) > \pi_{1 \bmod 3}(x) \quad\text{for all } x < 608981813028, \]

but in 1978 Bays and Hudson proved that

\[ \pi_{1 \bmod 3}(608981813029) > \pi_{2 \bmod 3}(608981813029). \]


So although experimentation is useful in making conjectures, these examples
show why mathematicians insist on rigorous proofs before they accept that a
mathematical statement is true.


⁴ Earlier, in 1914, Littlewood proved that as x increases, the difference $\pi_3(x) - \pi_1(x)$ switches
back and forth between positive and negative values infinitely many times!


Exercises
26.1. Use induction to prove the following statements.
(a) $1^3 + 2^3 + \cdots + n^3 = \dfrac{n^2(n+1)^2}{4}$
(b) $1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + (n-1)n = \dfrac{n^3 - n}{3}$
(c) $T_1 + T_2 + \cdots + T_n = \dfrac{n(n+1)(n+2)}{6}$, where $T_n = \dfrac{n(n+1)}{2}$ is the $n$th triangular
number. (We discussed triangular numbers in Chapter 1 and will return to the subject
in more detail in Chapter 31.)

(d) For every natural number n, write


\[ 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} = \frac{A_n}{B_n} \]


as a fraction in lowest terms. Prove that the denominator $B_n$ divides n!. (Although
there are other ways to prove this statement, you should give a proof by induction.)


26.2. The Fibonacci sequence 1, 1, 2, 3, 5, 8, 13, 21, ... is defined by setting $F_1 = F_2 = 1$,
and then subsequent terms in the sequence are determined by the formula

\[ F_{n+2} = F_{n+1} + F_n. \]

(In words, each term is the sum of the previous two terms.) Prove by induction that

\[ F_1 + F_2 + F_3 + \cdots + F_n = F_{n+2} - 1 \quad\text{for all natural numbers } n. \]

We will discuss the Fibonacci sequence in greater detail in Chapter 39.


26.3. When doing induction, the initialization step may start at some value other than n = 1.
For example, use induction to prove that

\[ n! < \frac{n^n}{2^n} \quad\text{for all } n \ge 6. \]

26.4. Consider the polynomial

\[ F(x) = x^2 + x + 41. \]

Its first few values at natural numbers are

  n   |  1 |  2 |  3 |  4 |  5 |  6 |  7 |   8 |   9 |  10
 F(n) | 43 | 47 | 53 | 61 | 71 | 83 | 97 | 113 | 131 | 151

all of which are prime. That seems unusual, so let’s check the next 10 values:

  n   |  11 |  12 |  13 |  14 |  15 |  16 |  17 |  18 |  19 |  20
 F(n) | 173 | 197 | 223 | 251 | 281 | 313 | 347 | 383 | 421 | 461


They’re all prime, too! 
(a) Compute the next 10 values of F(n); that is, compute F(21), F(22), ..., F(30). Are
they all prime?
(b) Do you think that F(n) is prime for every natural number n?

26.5. We give a proof by induction that life exists on other planets! More precisely,
consider the following statement:


L(n) : Given any set of n planets, if one of the planets
supports life, then all of the planets in the set support life.


We are going to prove, by induction, that the statement L(n) is true for all natural
numbers n.

We start with L(1), the initial case. It asserts that if we have one planet, and that planet
supports life, then that planet supports life. So statement L(1) is certainly true.

Next we make the induction hypothesis that L(n) is true, and we consider a set
consisting of n + 1 planets, at least one of which supports life. We let $P_1, \ldots, P_{n+1}$ be the
planets in the set, with $P_1$ being the planet that we know supports life. Now consider
the subset $\{P_1, P_2, \ldots, P_n\}$. This is a set of n planets, at least one of which supports
life, so by the induction hypothesis, all of $P_1, \ldots, P_n$ support life. Next consider the
subset $\{P_1, P_3, \ldots, P_{n+1}\}$. This is also a set of n planets, at least one of which supports life,
so again the induction hypothesis tells us that they all support life. We have proven that all
of the planets $P_1, P_2, \ldots, P_{n+1}$ support life, so we have proven that statement L(n + 1)
is true.

This completes the proof by induction that the statement L(n) is true for every natural
number n. Now consider the set of planets


{Mercury, Venus, Earth, Mars, Jupiter, Saturn}. 


This is a set of planets, at least one of which supports life, so our proof by induction 
conclusively demonstrates that there is life on Mars (as well as on Mercury, Venus, etc.). 

Is this conclusion correct? If not, then there must be something wrong with our induc- 
tion proof. What’s wrong? 


Chapter 27 


Euler’s Phi Function 
and Sums of Divisors 


When we studied perfect numbers in Chapter 15, we used the sigma function $\sigma(n)$,
where $\sigma(n)$ is defined to be the sum of all the divisors of n. We now propose to
conduct what may seem like a strange experiment. We take all the divisors of n,
apply Euler’s phi function to each divisor, add the values of Euler’s phi function,
and see what we get.

We start with an example, say n = 15. The divisors of 15 are 1, 3, 5, and 15. 
We first evaluate Euler’s phi function at the numbers 1, 3, 5, and 15, 


\[ \phi(1) = 1, \quad \phi(3) = 2, \quad \phi(5) = 4, \quad \phi(15) = 8. \]

Next we add the values to get

\[ \phi(1) + \phi(3) + \phi(5) + \phi(15) = 1 + 2 + 4 + 8 = 15. \]


The result is 15, the number we started with; but surely that’s just a coincidence. 
Let’s try a larger number that has lots of factors, say n = 315. The divisors 
of 315 are 
1, 3, 5, 7, 9, 15, 21, 35, 45, 63, 105, 315, 


and if we evaluate Euler’s phi function and add the values, we get 


\[ \phi(1) + \phi(3) + \phi(5) + \phi(7) + \phi(9) + \phi(15) + \phi(21) + \phi(35)
   + \phi(45) + \phi(63) + \phi(105) + \phi(315) \]
\[ = 1 + 2 + 4 + 6 + 6 + 8 + 12 + 24 + 24 + 36 + 48 + 144 \]
\[ = 315. \]

Again we end up with the number we started with. This is beginning to look like
more than a coincidence. We might even make the following guess.


Guess. Let $d_1, d_2, \ldots, d_r$ be the numbers that divide n, including
both 1 and n. Then

\[ \phi(d_1) + \phi(d_2) + \cdots + \phi(d_r) = n. \]
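
Before trying to prove the guess, it costs little to test it on many more values of n. The sketch below is only an illustration under our own naming (`phi`, `divisors`, and `phi_divisor_sum` are hypothetical helpers); it computes Euler’s phi function straight from its definition as a count of integers relatively prime to n, which is slow but transparent.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's phi function: how many a with 1 <= a <= n have gcd(a, n) = 1."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def divisors(n: int) -> list[int]:
    """All positive divisors of n, including 1 and n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def phi_divisor_sum(n: int) -> int:
    """phi(d_1) + phi(d_2) + ... + phi(d_r) over the divisors d_i of n."""
    return sum(phi(d) for d in divisors(n))


print(phi_divisor_sum(15))   # 15
print(phi_divisor_sum(315))  # 315
print(all(phi_divisor_sum(n) == n for n in range(1, 301)))  # True
```

As the previous chapter warns, agreement for the first few hundred values of n is encouragement to look for a proof and nothing more.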


How might we go about proving that our guess is correct? The easiest case
to check would be when n has very few divisors. For example, suppose that we
take n = p, where p is a prime. The divisors of p are 1 and p, and we know that
$\phi(1) = 1$ and $\phi(p) = p - 1$. Adding these gives

\[ \phi(1) + \phi(p) = 1 + (p - 1) = p. \]

So we have verified our guess when n is a prime.

Next we try $n = p^2$. The divisors of $p^2$ are 1, p, and $p^2$, and we know from
Chapter 11 that $\phi(p^2) = p^2 - p$, so we find that

\[ \phi(1) + \phi(p) + \phi(p^2) = 1 + (p - 1) + (p^2 - p) = p^2. \]

Notice how the terms cancel until only $p^2$ is left.
Emboldened by these successes, let’s try to verify our guess when $n = p^k$
is any power of a prime. The divisors of $p^k$ are $1, p, p^2, \ldots, p^k$. In Chapter 11
we found a formula for Euler’s phi function at a prime power: $\phi(p^i) = p^i - p^{i-1}$.
Using this formula enables us to compute

\[ \phi(1) + \phi(p) + \phi(p^2) + \cdots + \phi(p^{k-1}) + \phi(p^k) \]
\[ = 1 + (p - 1) + (p^2 - p) + \cdots + (p^{k-1} - p^{k-2}) + (p^k - p^{k-1}) \]
\[ = p^k. \]

Again the terms cancel, leaving exactly $p^k$. We have now verified that our guess is
true whenever n is a prime power.
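
To see the cancellation with actual numbers, here is a small worked instance of our own choosing, with p = 3 and k = 3, so that n = 27 has divisors 1, 3, 9, and 27:

\[ \phi(1) + \phi(3) + \phi(9) + \phi(27) = 1 + (3 - 1) + (9 - 3) + (27 - 9) = 1 + 2 + 6 + 18 = 27. \]

Each negative term cancels the positive term just before it, so only the final 27 survives.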
If n is not a power of a prime, the situation is somewhat more complicated. As
always, we start with the simplest case. Suppose that n = pq is the product of two
different primes. Then the divisors of n are 1, p, q, and pq, so we need to sum

\[ \phi(1) + \phi(p) + \phi(q) + \phi(pq). \]

We showed in Chapter 11 that Euler’s phi function satisfies a multiplication
formula $\phi(mn) = \phi(m)\phi(n)$ provided that m and n are relatively prime. In
particular, p and q are relatively prime, so $\phi(pq) = \phi(p)\phi(q)$. This means that

\[ \phi(1) + \phi(p) + \phi(q) + \phi(pq) = 1 + \phi(p) + \phi(q) + \phi(p)\phi(q) \]
\[ = (1 + \phi(p))(1 + \phi(q)) \]
\[ = pq, \]

which is exactly what we want.
Using this example as a guide, we are now ready to tackle the general case. For
any number n, we define a function F(n) by the formula

\[ F(n) = \phi(d_1) + \phi(d_2) + \cdots + \phi(d_r), \quad\text{where } d_1, d_2, \ldots, d_r \text{ are the divisors of } n. \]


Our goal is to show that F(n) = n for every number n. The first step is to check
that the function F satisfies a multiplication formula.


Lemma 27.1. If gcd(m, n) = 1, then F(mn) = F(m)F(n).
Proof. Let

$d_1, d_2, \ldots, d_r$ be the divisors of n,
and
$e_1, e_2, \ldots, e_s$ be the divisors of m.

The fact that m and n are relatively prime means that the divisors of mn are
precisely the various products

\[ d_1e_1, d_1e_2, \ldots, d_1e_s,\; d_2e_1, d_2e_2, \ldots, d_2e_s,\; \ldots,\; d_re_1, d_re_2, \ldots, d_re_s. \]

Furthermore, every $d_i$ is relatively prime to every $e_j$, so $\phi(d_ie_j) = \phi(d_i)\phi(e_j)$.
Using these facts, we can compute

\[ F(mn) = \phi(d_1e_1) + \cdots + \phi(d_1e_s) + \phi(d_2e_1) + \cdots + \phi(d_2e_s)
   + \cdots + \phi(d_re_1) + \cdots + \phi(d_re_s) \]
\[ = \phi(d_1)\phi(e_1) + \cdots + \phi(d_1)\phi(e_s) + \phi(d_2)\phi(e_1) + \cdots + \phi(d_2)\phi(e_s)
   + \cdots + \phi(d_r)\phi(e_1) + \cdots + \phi(d_r)\phi(e_s) \]
\[ = \bigl(\phi(d_1) + \phi(d_2) + \cdots + \phi(d_r)\bigr) \cdot \bigl(\phi(e_1) + \phi(e_2) + \cdots + \phi(e_s)\bigr) \]
\[ = F(m)F(n). \]

This completes the proof of the lemma. □


Using the lemma, it is now a simple matter to prove the following summation 
formula for Euler’s phi function. 


Theorem 27.2 (Euler’s Phi Function Summation Formula). Let $d_1, d_2, \ldots, d_r$ be
the divisors of n. Then

\[ \phi(d_1) + \phi(d_2) + \cdots + \phi(d_r) = n. \]

Proof. We let $F(n) = \phi(d_1) + \phi(d_2) + \cdots + \phi(d_r)$, and we need to verify that
F(n) always equals n. The calculation of $\phi(1) + \phi(p) + \phi(p^2) + \cdots + \phi(p^k)$ on
page 207 shows that for prime powers we have $F(p^k) = p^k$. Now factor n into a
product of prime powers, say $n = p_1^{k_1} p_2^{k_2} \cdots p_t^{k_t}$. The different prime powers are
relatively prime to one another, so we can use the multiplication formula for F to
compute

\[ F(n) = F(p_1^{k_1} p_2^{k_2} \cdots p_t^{k_t}) \]
\[ = F(p_1^{k_1}) F(p_2^{k_2}) \cdots F(p_t^{k_t}) \quad\text{from the multiplication formula,} \]
\[ = p_1^{k_1} p_2^{k_2} \cdots p_t^{k_t} \quad\text{since } F(p^k) = p^k \text{ for prime powers,} \]
\[ = n. \]
□


Exercises 


27.1. A function f(n) that satisfies the multiplication formula 
f(mn) = f(m)f(n) for all numbers m and n with gcd(m,n) = 1 


is called a multiplicative function. For example, we have seen that Euler’s phi
function $\phi(n)$ is multiplicative (Chapter 11) and that the sum of divisors function $\sigma(n)$ is
multiplicative (Chapter 15).

Suppose now that f(n) is any multiplicative function, and define a new function

\[ g(n) = f(d_1) + f(d_2) + \cdots + f(d_r), \quad\text{where } d_1, d_2, \ldots, d_r \text{ are the divisors of } n. \]

Prove that g(n) is a multiplicative function.


27.2. Liouville’s lambda function $\lambda(n)$ is defined by factoring n into a product of primes,
$n = p_1^{k_1} p_2^{k_2} \cdots p_t^{k_t}$, and then setting

\[ \lambda(n) = (-1)^{k_1 + k_2 + \cdots + k_t}. \]

(Also, we let $\lambda(1) = 1$.) For example, to compute $\lambda(1728)$, we factor $1728 = 2^6 \cdot 3^3$, and
then $\lambda(1728) = (-1)^{6+3} = (-1)^9 = -1$.
(a) Compute the following values of Liouville’s function: $\lambda(30)$; $\lambda(504)$; $\lambda(60750)$.
(b) Prove that $\lambda(n)$ is a multiplicative function as defined in Exercise 27.1; that is, prove
that if gcd(m, n) = 1, then $\lambda(mn) = \lambda(m)\lambda(n)$.
(c) We use Liouville’s lambda function to define a new function G(n) by the formula

\[ G(n) = \lambda(d_1) + \lambda(d_2) + \cdots + \lambda(d_r), \quad\text{where } d_1, d_2, \ldots, d_r \text{ are the divisors of } n. \]

Compute the value of G(n) for all 1 ≤ n ≤ 18.
(d) Use your computations in (c), and additional computations if necessary, to make a
guess as to the value of G(n). Check your guess for a few more values of n. Use
your guess to find the value of G(62141689) and G(60119483).


(e) Prove that your guess in (d) is correct. 


27.3. Let $d_1, d_2, \ldots, d_r$ be the numbers that divide n, including 1 and n. The t-power
sigma function $\sigma_t(n)$ is equal to the sum of the $t$th powers of the divisors of n,

\[ \sigma_t(n) = d_1^t + d_2^t + \cdots + d_r^t. \]

For example, $\sigma_2(10) = 1^2 + 2^2 + 5^2 + 10^2 = 130$. Of course, $\sigma_1(n)$ is just our old friend,
the sigma function $\sigma(n)$.

(a) Compute the values of $\sigma_2(12)$, $\sigma_3(10)$, and $\sigma_0(18)$.

(b) Show that if gcd(m, n) = 1, then $\sigma_t(mn) = \sigma_t(m)\sigma_t(n)$. In other words, show that
$\sigma_t$ is a multiplicative function. Is this formula still true if m and n are not relatively
prime?

(c) We showed in Chapter 15 that $\sigma(p^k) = (p^{k+1} - 1)/(p - 1)$. Find a similar formula
for $\sigma_t(p^k)$, and use it to compute $\sigma_4(2^6)$.

(d) The function $\sigma_0(n)$ counts the number of different divisors of n. Does your formula
in (c) work for $\sigma_0$? If not, give a correct formula for $\sigma_0(p^k)$. Use your formula
and (b) to find the value of $\sigma_0(42336000)$.


27.4. Let n be a positive integer. If the fractions

\[ \frac{1}{n},\ \frac{2}{n},\ \frac{3}{n},\ \ldots,\ \frac{n-1}{n},\ \frac{n}{n} \]

are reduced to lowest terms, their denominators are divisors of n. For each divisor d of n,
let N(d) be the number of fractions in the list whose denominator is exactly equal to d.
(a) Let $d_1, d_2, \ldots, d_r$ be the numbers that divide n, including 1 and n. What is the value
of

\[ N(d_1) + N(d_2) + \cdots + N(d_r)? \]

(b) For n = 12, write the fractions $\frac{1}{12}, \frac{2}{12}, \ldots, \frac{12}{12}$ in lowest terms and compute the values
of N(1), N(2), N(3), N(4), N(6), and N(12).
(c) Prove that $N(n) = \phi(n)$.
(d) More generally, prove that $N(d) = \phi(d)$ for every d that divides n.


(e) Use (a) and (d) to give an alternative proof of Euler’s Phi Function Summation For- 
mula (Theorem 27.2). 


Chapter 28 


Powers Modulo p 
and Primitive Roots 


If a and p are relatively prime, Fermat’s Little Theorem (Chapter 9) tells us that

\[ a^{p-1} \equiv 1 \pmod{p}. \]

Of course, it’s quite possible that some smaller power of a is congruent to 1
modulo p. For example, $2^3 \equiv 1 \pmod{7}$. On the other hand, there may be some values
of a that require the full $(p-1)$st power. For example, the powers of 3 modulo 7
are

\[ 3^1 \equiv 3,\quad 3^2 \equiv 2,\quad 3^3 \equiv 6,\quad 3^4 \equiv 4,\quad 3^5 \equiv 5,\quad 3^6 \equiv 1 \pmod{7}. \]

Thus, the full sixth power of 3 is required before we get to 1 modulo 7.

Let’s look at some more examples to see if we can spot a pattern. Table 28.1 
lists the smallest power of a that is congruent to 1 modulo p for the primes p = 5, 7, 
and 11 and for each a between 1 and p — 1. We might make two observations. 


1. The smallest exponent e such that $a^e \equiv 1 \pmod{p}$ seems to divide p − 1.
2. There are always some a’s that require the exponent p − 1.


Since we are studying this smallest exponent in this chapter, we give it a name.
The order of a modulo p is the quantity

\[ e_p(a) = \bigl(\text{the smallest exponent } e \ge 1 \text{ such that } a^e \equiv 1 \pmod{p}\bigr). \]

 p = 5              | p = 7              | p = 11
 1^1 ≡ 1 (mod 5)    | 1^1 ≡ 1 (mod 7)    | 1^1 ≡ 1 (mod 11)
 2^4 ≡ 1 (mod 5)    | 2^3 ≡ 1 (mod 7)    | 2^10 ≡ 1 (mod 11)
 3^4 ≡ 1 (mod 5)    | 3^6 ≡ 1 (mod 7)    | 3^5 ≡ 1 (mod 11)
 4^2 ≡ 1 (mod 5)    | 4^3 ≡ 1 (mod 7)    | 4^5 ≡ 1 (mod 11)
                    | 5^6 ≡ 1 (mod 7)    | 5^5 ≡ 1 (mod 11)
                    | 6^2 ≡ 1 (mod 7)    | 6^10 ≡ 1 (mod 11)
                    |                    | 7^10 ≡ 1 (mod 11)
                    |                    | 8^10 ≡ 1 (mod 11)
                    |                    | 9^5 ≡ 1 (mod 11)
                    |                    | 10^2 ≡ 1 (mod 11)

Table 28.1: Smallest Power of a That Equals 1 Modulo p


(Note that we only allow values of a that are relatively prime to p.)

Referring to Table 28.1, we see, for example, that $e_5(2) = 4$, $e_7(4) = 3$, and
$e_{11}(7) = 10$. Fermat’s Little Theorem says that $a^{p-1} \equiv 1 \pmod{p}$, so we know
that $e_p(a) \le p - 1$. Our first observation was that $e_p(a)$ seems to divide p − 1. Our
second observation was that there always seem to be some a’s with $e_p(a) = p - 1$.
We are going to check that both of these observations are true. We begin with the
first, which is the easier of the two.
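
The order can be computed directly from its definition by multiplying by a until the running power becomes 1. The sketch below is our own illustration (the function name `order` is a hypothetical choice); it reproduces the entries quoted from Table 28.1 and tests the first observation for a few small primes.

```python
def order(a: int, p: int) -> int:
    """e_p(a): the smallest exponent e >= 1 with a^e congruent to 1 modulo the prime p."""
    if a % p == 0:
        raise ValueError("a must not be divisible by p")
    exponent, power = 1, a % p
    while power != 1:
        power = power * a % p  # power now holds a^(exponent + 1) mod p
        exponent += 1
    return exponent


print(order(2, 5), order(4, 7), order(7, 11))  # 4 3 10

# First observation: the order always seems to divide p - 1.
for p in (5, 7, 11, 13, 17):
    assert all((p - 1) % order(a, p) == 0 for a in range(1, p))
```

The loop is guaranteed to stop because Fermat’s Little Theorem says that the power reaches 1 by the exponent p − 1 at the latest.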


Theorem 28.1 (Order Divisibility Property). Let a be an integer not divisible by
the prime p, and suppose that $a^n \equiv 1 \pmod{p}$. Then the order $e_p(a)$ divides n.
In particular, the order $e_p(a)$ always divides p − 1.


Proof. The definition of the order $e_p(a)$ tells us that

\[ a^{e_p(a)} \equiv 1 \pmod{p}, \]

and we are assuming that $a^n \equiv 1 \pmod{p}$. We divide n by $e_p(a)$ to get a quotient
and remainder,

\[ n = e_p(a)q + r \quad\text{with } 0 \le r < e_p(a). \]

Then

\[ 1 \equiv a^n \equiv a^{e_p(a)q + r} \equiv \bigl(a^{e_p(a)}\bigr)^q \cdot a^r \equiv 1^q \cdot a^r \equiv a^r \pmod{p}. \]

But $r < e_p(a)$, and by definition, $e_p(a)$ is the smallest positive exponent e that
makes $a^e \equiv 1 \pmod{p}$, so we must have r = 0. Therefore $n = e_p(a)q$, which
shows that $e_p(a)$ divides n.

Finally, Fermat’s Little Theorem (Chapter 9) tells us that $a^{p-1} \equiv 1 \pmod{p}$,
so taking n = p − 1, we conclude that $e_p(a)$ divides p − 1. □


Our next task is to look at the numbers that have the largest possible order:
$e_p(a) = p - 1$. If a is such a number, then the powers

\[ a, a^2, a^3, \ldots, a^{p-3}, a^{p-2}, a^{p-1} \pmod{p} \]

must all be different modulo p. [If the powers are not all different, then we would
have $a^i \equiv a^j \pmod{p}$ for some exponents $1 \le i < j \le p - 1$, which would mean
that $a^{j-i} \equiv 1 \pmod{p}$, where the exponent j − i is less than p − 1.] The numbers
that require the largest exponent are of sufficient importance for us to give them a
name.


A number g with maximum order
$e_p(g) = p - 1$
is called a primitive root modulo p.


Looking back at the tables for p = 5,7, 11, we see that 2 and 3 are primitive roots 
modulo 5, that 3 and 5 are primitive roots modulo 7, and that 2, 6, 7, and 8 are 
primitive roots modulo 11. 

We now come to the most important result in this chapter. 


Theorem 28.2 (Primitive Root Theorem). Every prime p has a primitive root.
More precisely, there are exactly $\phi(p - 1)$ primitive roots modulo p.


For example, the Primitive Root Theorem says that there are $\phi(10) = 4$
primitive roots modulo 11 and, sure enough, we saw that the primitive roots
modulo 11 are the numbers 2, 6, 7, and 8. Similarly, the theorem says that there are
$\phi(36) = 12$ primitive roots modulo 37 and that there are $\phi(9906) = 3024$
primitive roots modulo 9907. In fact, the primitive roots modulo 37 are the 12 numbers
2, 5, 13, 15, 17, 18, 19, 20, 22, 24, 32, 35. We won’t waste the space to list the 3024
primitive roots modulo 9907. One drawback of the Primitive Root Theorem is that
it doesn’t give a method for actually finding a primitive root modulo p. All we can
do is start checking a = 2, a = 3, a = 5, a = 6, ... until we find a value of a
with $e_p(a) = p - 1$. (Do you see why 4 can never be a primitive root?) However,
once we find one primitive root modulo p, it is not hard to find all the others (see
Exercise 28.5).
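
The search described above is easy to automate. The sketch below is our own illustration, and it uses a shortcut that the text does not spell out: by the Order Divisibility Property, $e_p(a)$ is a proper divisor of p − 1 exactly when $a^{(p-1)/q} \equiv 1 \pmod{p}$ for some prime q dividing p − 1, so it suffices to test those few exponents. The helper names `prime_divisors` and `primitive_roots` are hypothetical.

```python
def prime_divisors(n: int) -> list[int]:
    """The distinct primes dividing n, found by trial division."""
    result, q = [], 2
    while q * q <= n:
        if n % q == 0:
            result.append(q)
            while n % q == 0:
                n //= q
        q += 1
    if n > 1:
        result.append(n)
    return result


def primitive_roots(p: int) -> list[int]:
    """All a with 1 <= a < p whose order modulo the prime p equals p - 1."""
    qs = prime_divisors(p - 1)
    # a fails to be a primitive root exactly when a^((p-1)/q) = 1 for some q
    return [a for a in range(1, p)
            if all(pow(a, (p - 1) // q, p) != 1 for q in qs)]


print(primitive_roots(11))         # [2, 6, 7, 8]
print(primitive_roots(37))         # [2, 5, 13, 15, 17, 18, 19, 20, 22, 24, 32, 35]
print(len(primitive_roots(9907)))  # 3024
```

The built-in three-argument `pow` performs fast modular exponentiation, so even p = 9907 takes only a moment.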


Proof of the Primitive Root Theorem. We prove the Primitive Root Theorem using 
one of the most powerful techniques available in number theory: COUNTING. The 
use of counting was already illustrated in our proof of Theorem 11.1 on page 77. 
For the current proof, we will take a set of numbers and count how many numbers 
are in the set in two different ways. This idea of counting something in two different 
ways and comparing the results has wide applicability in number theory, and indeed 
in all mathematics. 

For each number a between 1 and p − 1, we know that the order $e_p(a)$ divides
p − 1. So for each number d dividing p − 1, we might ask how many a’s have their
order $e_p(a)$ equal to d. We call this number $\psi(d)$. In other words,

\[ \psi(d) = \bigl(\text{the number of } a\text{’s with } 1 \le a < p \text{ and } e_p(a) = d\bigr). \]

In particular, $\psi(p - 1)$ is the number of primitive roots modulo p.
Let n be any number dividing p − 1, say p − 1 = nk. Then we can factor the
polynomial $X^{p-1} - 1$ as

\[ X^{p-1} - 1 = X^{nk} - 1 = (X^n)^k - 1 \]
\[ = (X^n - 1)\bigl((X^n)^{k-1} + (X^n)^{k-2} + \cdots + (X^n)^2 + X^n + 1\bigr). \]

We count how many roots these polynomials have modulo p.
First we observe that

\[ X^{p-1} - 1 \equiv 0 \pmod{p} \quad\text{has exactly } p - 1 \text{ solutions,} \]

since Fermat’s Little Theorem tells us that X = 1, 2, 3, ..., p − 1 are all
solutions. On the other hand, the Polynomial Roots Mod p Theorem (Theorem 8.2 on
page 60) says that a polynomial of degree D with integer coefficients has at most D
roots modulo p, so

\[ X^n - 1 \equiv 0 \pmod{p} \]

has at most n solutions, and

\[ (X^n)^{k-1} + (X^n)^{k-2} + \cdots + X^n + 1 \equiv 0 \pmod{p} \]

has at most nk − n solutions.
We now know that

\[ \underbrace{X^{p-1} - 1}_{\text{exactly } p-1 = nk \text{ roots mod } p}
   = \underbrace{(X^n - 1)}_{\text{at most } n \text{ roots mod } p}
   \times \underbrace{\bigl((X^n)^{k-1} + (X^n)^{k-2} + \cdots + X^n + 1\bigr)}_{\text{at most } nk - n \text{ roots mod } p}. \]

The only way for this to be true is if $X^n - 1$ has exactly n roots modulo p, since
otherwise the right-hand side won’t have enough roots. This proves the following
important fact:


If n divides p − 1, then the congruence

\[ X^n - 1 \equiv 0 \pmod{p} \]

has exactly n solutions with 0 ≤ X < p.
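
The boxed fact is easy to observe for a small prime. The following sketch is our own illustration (the function name is hypothetical); it counts solutions of $X^n \equiv 1 \pmod{p}$ by trying every X with 0 ≤ X < p.

```python
def count_nth_roots_of_one(n: int, p: int) -> int:
    """Number of X with 0 <= X < p satisfying X^n - 1 = 0 modulo p."""
    return sum(1 for x in range(p) if pow(x, n, p) == 1 % p)


p = 11
for n in (1, 2, 5, 10):  # the divisors of p - 1 = 10
    print(n, count_nth_roots_of_one(n, p))  # prints n twice: 1 1, 2 2, 5 5, 10 10

# When n does not divide p - 1 there can be fewer than n solutions.
print(count_nth_roots_of_one(3, p))  # 1
```

The last line shows why the hypothesis that n divides p − 1 matters: modulo 11, only X = 1 satisfies $X^3 \equiv 1$.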


Now let’s count the number of solutions to $X^n - 1 \equiv 0 \pmod{p}$ in a different
way. If X = a is a solution, then $a^n \equiv 1 \pmod{p}$, so by the Order Divisibility
Property, we know that $e_p(a)$ divides n. So if we look at the divisors of n and if for
each divisor d of n we take those a’s with $e_p(a) = d$, then we end up with all the
solutions of the congruence $X^n - 1 \equiv 0 \pmod{p}$. In other words, if $d_1, d_2, \ldots, d_r$
are the divisors of n, then the number of solutions to $X^n - 1 \equiv 0 \pmod{p}$ is equal
to

\[ \psi(d_1) + \psi(d_2) + \cdots + \psi(d_r). \]

We have now counted the number of solutions to $X^n - 1 \equiv 0 \pmod{p}$ in two
different ways. First, we showed that there are n solutions, and second we showed
that there are $\psi(d_1) + \cdots + \psi(d_r)$ solutions. These numbers must be the same, so
merely by counting the number of solutions, we have proven the following
beautiful formula:


Let n divide p − 1 and let $d_1, d_2, \ldots, d_r$ be the divisors of n, including
both 1 and n. Then

\[ \psi(d_1) + \psi(d_2) + \cdots + \psi(d_r) = n. \]


This formula should look familiar; it’s exactly the same as the formula we proved
for Euler’s phi function in Chapter 27. We now use the fact that $\phi$ and $\psi$ both
satisfy this formula to show that $\phi$ and $\psi$ are actually equal.

Our first observation is that $\phi(1) = 1$ and $\psi(1) = 1$, so we’re okay for n = 1.
Next we check that $\phi(q) = \psi(q)$ when n = q is a prime. The divisors of q are 1
and q, so

\[ \phi(q) + \phi(1) = q = \psi(q) + \psi(1). \]

But we know that $\phi(1) = \psi(1) = 1$, so subtracting 1 from both sides gives
$\phi(q) = \psi(q)$.
How about $n = q^2$? The divisors of $q^2$ are 1, q, and $q^2$, so

\[ \phi(q^2) + \phi(q) + \phi(1) = q^2 = \psi(q^2) + \psi(q) + \psi(1). \]

But we already know that $\phi(q) = \psi(q)$ and $\phi(1) = \psi(1)$, so canceling them from
both sides gives $\phi(q^2) = \psi(q^2)$.

Similarly, if $n = q_1q_2$ for two different primes $q_1$ and $q_2$, then the divisors of n
are 1, $q_1$, $q_2$, and $q_1q_2$. This gives

\[ \phi(q_1q_2) + \phi(q_1) + \phi(q_2) + \phi(1) = q_1q_2
   = \psi(q_1q_2) + \psi(q_1) + \psi(q_2) + \psi(1), \]

and canceling the terms that we already know are equal leaves $\phi(q_1q_2) = \psi(q_1q_2)$.

These examples illustrate how to prove that $\phi(n) = \psi(n)$ for every n by
working up from small values of n to larger values of n. More formally, we can give a
proof by induction. So we assume that we already proved that $\phi(d) = \psi(d)$ for all
numbers d < n, and we attempt to prove that $\phi(n) = \psi(n)$. Let $d_1, d_2, \ldots, d_r$ be
the divisors of n as usual. One of these divisors is n itself, so relabeling them, we
may as well assume that $d_1 = n$. Using the summation formulas for $\phi$ and $\psi$, we
find that

\[ \phi(n) + \phi(d_2) + \phi(d_3) + \cdots + \phi(d_r) = n
   = \psi(n) + \psi(d_2) + \psi(d_3) + \cdots + \psi(d_r). \]

But all of the numbers $d_2, d_3, \ldots, d_r$ are strictly less than n, so our assumption
tells us that $\phi(d_i) = \psi(d_i)$ for each i = 2, 3, ..., r. This means that we can cancel
these values from both sides of the equation, which leaves the desired equality
$\phi(n) = \psi(n)$.

To recapitulate, we have proved that for each number n dividing p − 1 there
are exactly $\phi(n)$ numbers a with $e_p(a) = n$. Taking n = p − 1, we see that there
are exactly $\phi(p - 1)$ numbers a with $e_p(a) = p - 1$. But a’s with $e_p(a) = p - 1$
are precisely the primitive roots modulo p, so we have proved that there are exactly
$\phi(p - 1)$ primitive roots modulo p. Since the number $\phi(p - 1)$ is always at least 1,
we see that every prime has at least one primitive root. This completes our proof
of the Primitive Root Theorem. □
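
The conclusion of the proof, that exactly $\phi(d)$ numbers have order d for each d dividing p − 1, can be watched in action for a particular prime. The sketch below is our own illustration; `order`, `phi`, and `psi_table` are hypothetical helper names, and both functions are computed naively from their definitions.

```python
from collections import Counter
from math import gcd


def order(a: int, p: int) -> int:
    """e_p(a): the smallest exponent e >= 1 with a^e congruent to 1 modulo p."""
    exponent, power = 1, a % p
    while power != 1:
        power = power * a % p
        exponent += 1
    return exponent


def phi(n: int) -> int:
    """Euler's phi function, computed directly from the definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def psi_table(p: int) -> Counter:
    """psi(d) for each d: how many a with 1 <= a < p have order exactly d."""
    return Counter(order(a, p) for a in range(1, p))


p = 37
psi = psi_table(p)
for d in sorted(psi):
    print(d, psi[d], phi(d))  # the last two columns agree on every line
```

For p = 37 the final line of output is `36 12 12`, matching the twelve primitive roots modulo 37 listed earlier.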


The Primitive Root Theorem tells us that there are lots of primitive roots
modulo p, in fact, precisely $\phi(p - 1)$ of them. Unfortunately, it doesn’t give us any
information at all about which specific numbers are primitive roots. Suppose we
turn the question around, fix a number a, and ask for which primes p is a a
primitive root. For example, for which primes p is 2 a primitive root? The Primitive
Root Theorem gives us no information at all!
Here is a list of the order $e_p(2)$ for all primes up to 100, where we write $e_p$
instead of $e_p(2)$ to save space.

 e_3 = 2    | e_5 = 4    | e_7 = 3    | e_11 = 10  | e_13 = 12  | e_17 = 8
 e_19 = 18  | e_23 = 11  | e_29 = 28  | e_31 = 5   | e_37 = 36  | e_41 = 20
 e_43 = 14  | e_47 = 23  | e_53 = 52  | e_59 = 58  | e_61 = 60  | e_67 = 66
 e_71 = 35  | e_73 = 9   | e_79 = 39  | e_83 = 82  | e_89 = 11  | e_97 = 48

Looking at this list, we see that 2 is a primitive root for the primes

p = 3, 5, 11, 13, 19, 29, 37, 53, 59, 61, 67, 83.
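
The list of orders is a natural thing to regenerate by machine, and doing so makes it easy to extend the experiment beyond 100. The sketch below is our own illustration with hypothetical helper names; it recomputes $e_p(2)$ for the odd primes below 100 and picks out those for which 2 is a primitive root.

```python
def is_prime(n: int) -> bool:
    """Primality by trial division; adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def order(a: int, p: int) -> int:
    """e_p(a): the smallest exponent e >= 1 with a^e congruent to 1 modulo p."""
    exponent, power = 1, a % p
    while power != 1:
        power = power * a % p
        exponent += 1
    return exponent


orders_of_2 = {p: order(2, p) for p in range(3, 100) if is_prime(p)}
primes_with_root_2 = [p for p, e in orders_of_2.items() if e == p - 1]

print(orders_of_2[17], orders_of_2[73], orders_of_2[89])  # 8 9 11
print(primes_with_root_2)  # [3, 5, 11, 13, 19, 29, 37, 53, 59, 61, 67, 83]
```

Raising the bound in `range(3, 100)` gives more data for anyone who wants to hunt for a pattern.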


Do you see any pattern? Don’t be discouraged if you don’t; no one else has found 
a simple pattern, either. However, in the 1920s Emil Artin made the following 
conjecture. 


Conjecture 28.3 (Artin’s Conjecture). There are infinitely many primes p such that 
2 is a primitive root modulo p. 


Of course, there’s nothing special about the number 2, so Artin also made the 
following conjecture. 


Conjecture 28.4 (The Generalized Artin Conjecture). Let a be any integer that is 
not a perfect square and is not equal to —1. Then there are infinitely many primes p 
such that a is a primitive root modulo p. 


Artin’s Conjecture is still unsolved, although much progress has been made on 
it in recent years. For example, in 1967 Christopher Hooley proved that if a cer- 
tain other conjecture called the Generalized Riemann Hypothesis is true, then the 
Generalized Artin Conjecture is also true. Equally striking, Rajiv Gupta, M. Ram 
Murty, and Roger Heath-Brown proved in 1985 that there are at most three pair- 
wise relatively prime values of a for which the Generalized Artin Conjecture is 
false. Of course, these three putative “bad values” of a probably don’t exist, but 
no one yet knows how to prove that they don’t exist. And no one has been able 
to prove that a = 2 is not a bad value, so even Artin’s original conjecture remains 
unproved! 


Costas Arrays 

We are now going to describe Costas arrays, which are mathematical objects that 
have applications to sonar and radar technology.¹ Surprisingly, Costas arrays are
also related to primitive roots! In order to create a Costas array, we start with a
square array of boxes, for example the following six-by-six array, where we have
labeled the rows and columns:

[Figure: an empty six-by-six array of boxes, with rows and columns labeled 1 through 6]


¹ J. P. Costas, A study of a class of detection waveforms having nearly ideal range-Doppler
ambiguity properties, Proceedings of the IEEE, 72, 8 (1984), 996-1009.


We next put dots into the centers of some of the boxes, but the dots have to 
obey the following Costas rules: 


1. Every row has exactly one dot. 
2. Every column has exactly one dot. 


3. Form all of the line segments by connecting every pair of dots. Then no two 
line segments have the same length and the same slope. 


It’s not hard to fill in the dots to obey Rules 1 and 2. Such arrays are called 
permutation arrays. But it’s much trickier to satisfy Rule 3. Here is one 
Costas array of size six and two permutation arrays that are not Costas arrays. For 
the non-Costas examples, we have drawn line segments that have the same length 
and the same slope. 


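[Figure: three six-by-six permutation arrays, captioned from left to right "Costas array," "Not a Costas array," and "Not a Costas array." In each non-Costas example, two line segments with the same length and the same slope are drawn.]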


Lloyd R. Welch discovered an interesting way to use primitive roots to construct 
Costas arrays of size p − 1, where p is a prime. Here’s how his construction 
works. Let g be a primitive root modulo p. Then in the i-th row we put a dot in the 
j-th column, where i and j are related by 


$$j \equiv g^i \pmod{p}.$$


Here we take i and j to lie between 1 and p − 1. 
We illustrate by taking p = 11 and g = 2. We start by making a table of powers 
of 2 modulo 11. 


Row i    | 1 | 2 | 3 | 4 | 5  | 6 | 7 | 8 | 9 | 10
Column j | 2 | 4 | 8 | 5 | 10 | 9 | 7 | 3 | 6 | 1


Rows and Columns Satisfying $j \equiv 2^i \pmod{11}$ 


Using Welch’s procedure, this gives the following 10-by-10 Costas array: 

[Figure: a 10-by-10 array with rows and columns labeled 1 to 10; the dot in row i sits in the column j listed for that row in the table of powers of 2 modulo 11.]


We now verify that Welch’s construction gives a Costas array. The fact that g 
is a primitive root means that the numbers $g, g^2, \ldots, g^{p-1}$ (mod p) are distinct, so 
we certainly get a permutation array; that is, an array that satisfies Rules 1 and 2. 
We now verify that it also satisfies Rule 3. 

We write $D(i, j)$ to denote a dot in the i-th row and the j-th column. Suppose 
now that the line segment connecting the dots $D(i_1, j_1)$ and $D(i_2, j_2)$ has the same 
length and the same slope as the line segment connecting the dots $D(u_1, v_1)$ and 
$D(u_2, v_2)$. We will use this assumption to derive a contradiction. We observe that 
two line segments have the same length and slope if and only if the right triangles 
that they form are congruent, as in the following picture: 


[Figure: two congruent right triangles. The first has its hypotenuse joining $D(i_1, j_1)$ to $D(i_2, j_2)$ and its right angle at the position $(i_2, j_1)$; the second has its hypotenuse joining $D(u_1, v_1)$ to $D(u_2, v_2)$ and its right angle at the position $(u_2, v_1)$.]


So our assumption that the line segments have the same length and slope is equivalent 
to the assertion that 


$$i_2 - i_1 = u_2 - u_1 \quad\text{and}\quad j_2 - j_1 = v_2 - v_1. \qquad (*)$$

We now use the Welch rule $j \equiv g^i \pmod{p}$, which applies to all four of the dots 
$D(i_1, j_1)$, $D(i_2, j_2)$, $D(u_1, v_1)$, and $D(u_2, v_2)$, 
to rewrite the second equation in (*) as 


$$g^{i_2} - g^{i_1} \equiv g^{u_2} - g^{u_1} \pmod{p}.$$


We next pull out a factor from each side, 


$$g^{i_1}\left(g^{i_2 - i_1} - 1\right) \equiv g^{u_1}\left(g^{u_2 - u_1} - 1\right) \pmod{p}.$$


But (*) says that $i_2 - i_1 = u_2 - u_1$, so the quantities in parentheses are the same, 
and they are not zero modulo p, because we know that $i_1 \ne i_2$ and $u_1 \ne u_2$. 
(This is where we use the assumption that the dots are in different rows.) So we 
can cancel the quantities in parentheses from both sides of the congruence, which 
yields 

$$g^{i_1} \equiv g^{u_1} \pmod{p}.$$

Since g is a primitive root, this means that $i_1 \equiv u_1 \pmod{p-1}$, and since $i_1$ 
and $u_1$ are between 1 and p − 1, this means that they are equal. Since the array 
has only one dot in each row, this implies that $D(i_1, j_1) = D(u_1, v_1)$, and then (*) 
gives $D(i_2, j_2) = D(u_2, v_2)$. This contradicts the assumption that we started with 
two different line segments, which completes the proof that Welch’s procedure 
gives a Costas array. 
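
Welch's rule and the three Costas rules are easy to check by machine. The following is a minimal Python sketch, not part of the original text; the function names `welch_array` and `is_costas` are illustrative choices. It relies on the observation that two segments have the same length and slope exactly when they have the same displacement (row change, column change), once each segment is read from its lower-numbered row to its higher-numbered row.

```python
from itertools import combinations


def welch_array(p, g):
    """Return {row i: column j} with j = g^i mod p for 1 <= i <= p - 1."""
    return {i: pow(g, i, p) for i in range(1, p)}


def is_costas(dots):
    """Check Rules 1-3 for a dict mapping each row to the column of its dot."""
    n = len(dots)
    # Rules 1 and 2: the rows are 1..n and the columns are a permutation of 1..n.
    if sorted(dots) != list(range(1, n + 1)):
        return False
    if sorted(dots.values()) != list(range(1, n + 1)):
        return False
    # Rule 3: no displacement (row change, column change) may occur twice.
    seen = set()
    for i1, i2 in combinations(sorted(dots), 2):
        step = (i2 - i1, dots[i2] - dots[i1])
        if step in seen:
            return False
        seen.add(step)
    return True


dots = welch_array(11, 2)
print([dots[i] for i in range(1, 11)])  # [2, 4, 8, 5, 10, 9, 7, 3, 6, 1]
print(is_costas(dots))                  # True
```

The printed columns agree with the table for p = 11 and g = 2. Passing a base that is not a primitive root (for instance g = 3 modulo 11) makes `is_costas` return False, because the columns then repeat.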

Welch’s construction produces Costas arrays of size p — 1 for all primes p. 
There is another construction that gives Costas arrays of size p—2 for prime powers 
p”; see Exercise 28.18. However, these and other constructions do not give all 
possible sizes. It is known that Costas arrays exist in all sizes up to 31, but none 
are known of size 32 or 33. 


Exercises 


28.1. Let p be a prime number. 
(a) What is the value of $1 + 2 + 3 + \cdots + (p-1) \pmod{p}$? 
(b) What is the value of $1^2 + 2^2 + 3^2 + \cdots + (p-1)^2 \pmod{p}$? 
(c) For any positive integer k, find the value of 


$$1^k + 2^k + 3^k + \cdots + (p-1)^k \pmod{p}$$

and prove that your answer is correct. 


28.2. For any integers a and m with gcd(a, m) = 1, we let $e_m(a)$ be the smallest exponent 
$e \ge 1$ such that $a^e \equiv 1 \pmod{m}$. We call $e_m(a)$ the order of a modulo m. 
(a) Compute the following values of $e_m(a)$: 


(i) $e_9(2)$   (ii) $e_{15}(2)$   (iii) $e_{16}(3)$   (iv) $e_{10}(3)$ 
(b) Show that $e_m(a)$ always divides $\phi(m)$. 


28.3. In this exercise you will investigate the value of $e_m(2)$ for odd integers m. To save 
space, we write $e_m$ instead of $e_m(2)$, so for this exercise $e_m$ is the smallest power of 2 that 
is congruent to 1 modulo m. 
(a) Compute the value of $e_m$ for each odd number $11 \le m \le 19$. 
(b) Here is a table giving the values of $e_m$ for all odd numbers between 3 and 149 [except 
for $11 \le m \le 19$, which you did in part (a)]. 

e_3 = 2      e_5 = 4      e_7 = 3      e_9 = 6      e_11 = **    e_13 = **    e_15 = **
e_17 = **    e_19 = **    e_21 = 6     e_23 = 11    e_25 = 20    e_27 = 18    e_29 = 28
e_31 = 5     e_33 = 10    e_35 = 12    e_37 = 36    e_39 = 12    e_41 = 20    e_43 = 14
e_45 = 12    e_47 = 23    e_49 = 21    e_51 = 8     e_53 = 52    e_55 = 20    e_57 = 18
e_59 = 58    e_61 = 60    e_63 = 6     e_65 = 12    e_67 = 66    e_69 = 22    e_71 = 35
e_73 = 9     e_75 = 20    e_77 = 30    e_79 = 39    e_81 = 54    e_83 = 82    e_85 = 8
e_87 = 28    e_89 = 11    e_91 = 12    e_93 = 10    e_95 = 36    e_97 = 48    e_99 = 30
e_101 = 100  e_103 = 51   e_105 = 12   e_107 = 106  e_109 = 36   e_111 = 36   e_113 = 28
e_115 = 44   e_117 = 12   e_119 = 24   e_121 = 110  e_123 = 20   e_125 = 100  e_127 = 7
e_129 = 14   e_131 = 130  e_133 = 18   e_135 = 36   e_137 = 68   e_139 = 138  e_141 = 46
e_143 = 60   e_145 = 28   e_147 = 42   e_149 = 148


Using this table, find (i.e., guess) a formula for $e_{mn}$ in terms of $e_m$ and $e_n$ whenever 
gcd(m, n) = 1. 
(c) Use your conjectural formula from (b) to find the value of $e_{11227}$. (Note that 11227 = 
103 · 109.) 


(d) Prove that your conjectural formula in (b) is true. 

(e) Use the table to guess a formula for $e_{p^k}$ in terms of $e_p$, p, and k, where p is an odd 
prime. Use your formula to find the value of $e_{68921}$. (Note that $68921 = 41^3$.) 

(f) Can you prove that your conjectural formula for $e_{p^k}$ in (e) is correct? 


28.4. (a) Find all primitive roots modulo 13. 
(b) For each number d dividing 12, list the a’s with $1 \le a < 13$ and $e_{13}(a) = d$. 


28.5. (a) If g is a primitive root modulo 37, which of the numbers $g^2, g^3, \ldots, g^8$ are 
primitive roots modulo 37? 
(b) If g is a primitive root modulo p, develop an easy-to-use rule for determining if $g^k$ is 
a primitive root modulo p, and prove that your rule is correct. 
(c) Suppose that g is a primitive root modulo the prime p = 21169. Use your rule from 
(b) to determine which of the numbers $g^2, g^3, \ldots, g^{20}$ are primitive roots modulo 
21169. 


28.6. (a) Find all primes less than 20 for which 3 is a primitive root. 
(b) If you know how to program a computer, find all primes less than 100 for which 3 is 
a primitive root. 


28.7. If $a = b^2$ is a perfect square and p is an odd prime, explain why it is impossible for a 
to be a primitive root modulo p. 


28.8. Let p be an odd prime and let g be a primitive root modulo p. 
(a) Prove that $g^k$ is a quadratic residue modulo p if and only if k is even. 
(b) Use (a) to give a quick proof that the product of two nonresidues is a residue, and 
more generally that $\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$. 
(c) Use (a) to give a quick proof of Euler’s Criterion $a^{(p-1)/2} \equiv \left(\frac{a}{p}\right) \pmod{p}$. 


28.9. Suppose that q is a prime number that is congruent to 1 modulo 4, and suppose that 
the number p = 2q + 1 is also a prime number. (For example, q could equal 5 and p 
equal 11.) Show that 2 is a primitive root modulo p. [Hint. Euler’s Criterion and Quadratic 
Reciprocity will be helpful.] 


28.10. Let p be a prime, let k be a number not divisible by p, and let b be a number that has 
a k-th root modulo p. Find a formula for the number of k-th roots of b modulo p and prove 
that your formula is correct. [Hint. Your formula should depend only on p and k, not on b.] 


28.11. Write a program to compute $e_p(a)$, which is the smallest positive exponent e 
such that $a^e \equiv 1 \pmod{p}$. [Be sure to use the fact that if $a^e \not\equiv 1 \pmod{p}$ for all $1 \le e < p/2$, 
then $e_p(a)$ is automatically equal to p − 1.] 


28.12. Write a program that finds the smallest primitive root for a given prime p. 
Make a list of all primes between 100 and 200 for which 2 is a primitive root. 


28.13. If a is relatively prime to both m and n and if gcd(m, n) = 1, find a formula for 
$e_{mn}(a)$ in terms of $e_m(a)$ and $e_n(a)$. 


28.14. For any number $m \ge 2$, not necessarily prime, we say that g is a primitive root 
modulo m if the smallest power of g that is congruent to 1 modulo m is the $\phi(m)$-th power. 
In other words, g is a primitive root modulo m if gcd(g, m) = 1 and $g^k \not\equiv 1 \pmod{m}$ for 
all powers $1 \le k < \phi(m)$. 
(a) For each number $2 \le m \le 25$, determine if there are any primitive roots modulo m. 
(If you have a computer, do the same for all $m \le 50$.) 
(b) Use your data from (a) to make a conjecture as to which m’s have primitive roots and 
which ones do not. 
(c) Prove that your conjecture in (b) is correct. 


28.15. Recall that a permutation array is an array in which each row has exactly one dot 
and each column has exactly one dot. 
(a) How many N-by-N permutation arrays are there? [Hint. Place dots one row at a 
time, and think about how many choices you have for each successive row.] 
(b) (The rest of this exercise is for students who know how to multiply matrices.) We 
can turn a dotted array into a matrix by replacing each dot with a 1 and putting a 0 in 
all of the other places. For example, the 6-by-6 permutation array 


[Figure: a 6-by-6 permutation array]


becomes the 6-by-6 matrix A = 


[Matrix: a 6-by-6 array of 0s and 1s with exactly one 1 in each row and in each column]


Compute the first few powers of this matrix A. In particular, what is the value of A°? 
(c) Let A be an N-by-N permutation matrix, that is, a matrix that is created from a 
permutation array. Prove that there is an integer $k \ge 1$ such that $A^k$ is the identity 
matrix. 

(d) Find an example of an N-by-N permutation matrix A such that the smallest number k 
for which $A^k$ is the identity matrix satisfies k > N. 


28.16. (a) Find all Costas arrays of size 3. 
(b) Write down one Costas array of size 4. 
(c) Write down one Costas array of size 5. 
(d) Write down one Costas array of size 7. 


28.17. Use Welch’s construction to find a Costas array of size 16. Be sure to indicate 
which primitive root you used. 


28.18. This exercise describes a special case of a construction of Lempel and Golomb for 
creating Costas arrays of size p − 2. 
(a) Let $g_1$ and $g_2$ be primitive roots modulo p. (They are allowed to be equal.) Prove that 
for every $1 \le i \le p - 2$ there is a unique $1 \le j \le p - 2$ satisfying 


$$g_1^i + g_2^j \equiv 1 \pmod{p}.$$


(b) Create a (p − 2)-by-(p − 2) array by putting a dot in the i-th row and the j-th column 
if i and j satisfy $g_1^i + g_2^j \equiv 1 \pmod{p}$. Prove that the resulting array is a Costas array. 

(c) Use the Lempel–Golomb construction to write down two Costas arrays of size 15. 
For the first, use $g_1 = g_2 = 5$, and for the second, use $g_1 = 3$ and $g_2 = 6$. 


Chapter 29 


Primitive Roots and Indices 


The beauty of a primitive root g modulo a prime p is the appearance of every 
nonzero number modulo p as a power of g. So for any number $1 \le a < p$, we can 
pick out exactly one of the powers 


$$g,\ g^2,\ g^3,\ g^4,\ \ldots,\ g^{p-3},\ g^{p-2},\ g^{p-1}$$


as being congruent to a modulo p. The exponent is called the index of a modulo p 
for the base g. Assuming that p and g have been specified, we write $I(a)$ for the 
index. 

For example, if we use the primitive root 2 as base for the prime 13, then 
$I(3) = 4$, since $2^4 = 16 \equiv 3 \pmod{13}$. Similarly, $I(5) = 9$, since $2^9 = 512 \equiv 5 \pmod{13}$. 
To find the index of any particular number, such as 7, we just compute 
the powers $2, 2^2, 2^3, \ldots$ modulo 13 until we get to a number that is congruent to 7. 

Another approach is to make a table of all powers of 2 modulo 13. Then we 
can read any information we want from the table. 


I            | 1 | 2 | 3 | 4 | 5 | 6  | 7  | 8 | 9 | 10 | 11 | 12
2^I (mod 13) | 2 | 4 | 8 | 3 | 6 | 12 | 11 | 9 | 5 | 10 | 7  | 1


Powers of 2 Modulo 13 


For example, to find $I(11)$, we scan the second row of the table until we find the 
number 11, and then the index $I(11) = 7$ can be read from the first row. 

This suggests another way to arrange the data that might be more useful. What 
we do is rearrange the numbers so that the second row is in numerical order from 1 
to 12, and then we switch the first and second rows. The resulting table has the 
numbers from 1 to 12 in order in the first row, and below each number is its index. 


a    | 1  | 2 | 3 | 4 | 5 | 6 | 7  | 8 | 9 | 10 | 11 | 12
I(a) | 12 | 1 | 4 | 2 | 9 | 5 | 11 | 3 | 8 | 10 | 7  | 6


Table of Indices Modulo 13 for the Base 2 


Now it’s even easier to read off the index of any given number, such as $I(8) = 3$ 
and $I(10) = 10$. 
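
A table of indices can be produced by computing successive powers of the base, exactly as described above. Here is a minimal Python sketch, offered as an illustration rather than as part of the original text; the function name `index_table` is an assumption made for this example.

```python
def index_table(p, g):
    """Map each a in 1..p-1 to its index I(a), assuming g is a primitive root mod p."""
    table = {}
    power = 1
    for exponent in range(1, p):
        power = (power * g) % p  # power is now g^exponent reduced modulo p
        table[power] = exponent
    if len(table) != p - 1:
        raise ValueError("g is not a primitive root modulo p")
    return table


indices = index_table(13, 2)
print([indices[a] for a in range(1, 13)])
# [12, 1, 4, 2, 9, 5, 11, 3, 8, 10, 7, 6]
```

The printed list is the second row of the table of indices modulo 13 for the base 2.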

In the past, number theorists compiled tables of indices to be used for numerical 
calculations. The reason that indices are useful for calculations is highlighted by 
the following theorem. 


Theorem 29.1 (Rules for Indices). Indices satisfy the following rules: 
(a) $I(ab) \equiv I(a) + I(b) \pmod{p-1}$   [Product Rule] 
(b) $I(a^k) \equiv k\,I(a) \pmod{p-1}$   [Power Rule] 


Proof. These rules are nothing more than the usual laws of exponents, combined 
with the fact that g is a primitive root. Thus, to check (a), we compute 


$$g^{I(ab)} \equiv ab \equiv g^{I(a)} g^{I(b)} \equiv g^{I(a)+I(b)} \pmod{p}.$$


This means that $g^{I(ab)-I(a)-I(b)} \equiv 1 \pmod{p}$. But g is a primitive root, so 
$I(ab) - I(a) - I(b)$ must be a multiple of p − 1. This completes the proof of (a). To 
check (b), we perform a similar computation, 


$$g^{I(a^k)} \equiv a^k \equiv \left(g^{I(a)}\right)^k \equiv g^{kI(a)} \pmod{p}.$$

This implies that $I(a^k) - kI(a)$ is a multiple of p − 1, which is (b). □ 


One of the most common mistakes made when working with indices is to reduce 
them modulo p instead of modulo p − 1. It is important to keep in mind that 
indices appear as exponents, and the exponent in Fermat’s Little Theorem is p − 1, 
not p. We reiterate: 

Always Reduce Indices Modulo p − 1. 


In 1839, Carl Jacobi published a Canon Arithmeticus containing a table of indices for all primes 
less than 1000. More recently, an extensive table containing all primes up to 50021 was compiled by 
Western and Miller and published by the Royal Society at Cambridge University in 1968. 

I want to explain briefly how the index rules and a table of indices can be used to 
simplify calculations and solve congruences. For that purpose, here is a table of 
indices for the prime p = 37 and the base g = 2. 


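a    | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18
I(a) | 36 | 1  | 26 | 2  | 23 | 27 | 32 | 3  | 16 | 24 | 30 | 28 | 11 | 33 | 13 | 4  | 7  | 17

a    | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36
I(a) | 35 | 25 | 22 | 31 | 15 | 29 | 10 | 12 | 6  | 34 | 21 | 14 | 9  | 5  | 20 | 8  | 19 | 18


Table of Indices Modulo 37 for the Base 2 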


If we want to compute 23 · 19 (mod 37), rather than multiplying 23 and 19, we can 
instead add their indices. Thus, 


$$I(23 \cdot 19) \equiv I(23) + I(19) \equiv 15 + 35 \equiv 50 \equiv 14 \pmod{36}.$$


Note that the computation is done modulo p − 1, in this case, modulo 36. Looking 
at the table, we find that $I(30) = 14$ and conclude that $23 \cdot 19 \equiv 30 \pmod{37}$. 
“Wait a minute,” you are probably protesting, “using indices to compute the 
product 23 · 19 (mod 37) is a lot of work.” It would be easier to just multiply 23 
by 19, divide the product by 37, and take the remainder. There is a somewhat 
stronger case to be made for using indices to compute powers. For example, 


$$I(29^{14}) \equiv 14 \cdot I(29) \equiv 14 \cdot 21 \equiv 294 \equiv 6 \pmod{36}.$$


From the table we see that $I(27) = 6$, so $29^{14} \equiv 27 \pmod{37}$. Here the number 
$29^{14}$ has 21 digits, so we wouldn’t want to compute the exact value of $29^{14}$ by 
hand and then reduce modulo 37. On the other hand, we know how to compute 
$29^{14}$ (mod 37) quite rapidly using the method of successive squares (Chapter 16). 
So are indices actually useful for anything? The answer is that the real power of a 
table of indices lies not in its use for direct computations, but rather as a tool for 
solving congruences. We give two illustrations. 
For our first example, consider the congruence 


$$19x \equiv 23 \pmod{37}.$$

If x is a solution, then the index of 19x is equal to the index of 23. Using the 
product rule and taking values from the table of indices, we can compute 

$$I(19x) = I(23)$$
$$I(19) + I(x) \equiv I(23) \pmod{36}$$
$$35 + I(x) \equiv 15 \pmod{36}$$
$$I(x) \equiv -20 \equiv 16 \pmod{36}.$$

Thus, the index of the solution is $I(x) = 16$, and looking again at the table, we find 
that $x \equiv 9 \pmod{37}$. You should compare this solution of $19x \equiv 23 \pmod{37}$ 
with the more cumbersome method described in Chapter 8. Of course, the index 
method won’t work unless you have a table of indices already compiled, so the 
Linear Congruence Theorem in Chapter 8 is certainly not obsolete. 

For our second example we’ll solve a problem that until now would have required 
a great deal of tedious computation. We ask for all solutions to the congruence 

$$3x^{30} \equiv 4 \pmod{37}.$$


We start by taking the index of both sides and using the product and power rules. 


$$I(3x^{30}) = I(4)$$
$$I(3) + 30\,I(x) \equiv I(4) \pmod{36}$$
$$26 + 30\,I(x) \equiv 2 \pmod{36}$$
$$30\,I(x) \equiv -24 \equiv 12 \pmod{36}.$$


So we need to solve the congruence $30\,I(x) \equiv 12 \pmod{36}$ for $I(x)$. [Warning: 
Do not divide both sides by 6 to get $5\,I(x) \equiv 2 \pmod{36}$; you’ll lose some of the 
answers.] We saw in Chapter 8 how to solve a congruence of this sort. In general, 
the congruence $ax \equiv c \pmod{m}$ has gcd(a, m) solutions if gcd(a, m) divides c; 
otherwise it has no solutions. In our case gcd(30, 36) = 6 does divide 12, so there 
should be six solutions. Using the methods of Chapter 8, or just by trial and error, 
we find that 


$$30\,I(x) \equiv 12 \pmod{36}$$
for 
$$I(x) \equiv 4,\ 10,\ 16,\ 22,\ 28,\ \text{and}\ 34 \pmod{36}.$$


Finally, we look back at the table of indices to get the corresponding values of x, 


$$I(16) = 4, \quad I(25) = 10, \quad I(9) = 16,$$
$$I(21) = 22, \quad I(12) = 28, \quad I(28) = 34.$$


Thus, the congruence $3x^{30} \equiv 4 \pmod{37}$ has six solutions, 
$$x \equiv 16,\ 25,\ 9,\ 21,\ 12,\ 28 \pmod{37}.$$
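
The index method for a congruence of the form $c\,x^k \equiv b \pmod{p}$ can be sketched in a few lines of Python. This is an illustrative sketch only: the names `index_table` and `solve_with_indices` are invented for the example, the table is rebuilt from scratch rather than looked up, and the linear congruence in $I(x)$ is solved by trial and error over all exponents, which is one of the two options mentioned above.

```python
def index_table(p, g):
    """Map each a in 1..p-1 to its index I(a) for the primitive root g modulo p."""
    table = {}
    power = 1
    for exponent in range(1, p):
        power = (power * g) % p
        table[power] = exponent
    return table


def solve_with_indices(c, k, b, p, g):
    """Return all x in 1..p-1 with c * x^k congruent to b modulo p."""
    index = index_table(p, g)
    number_with_index = {e: a for a, e in index.items()}
    modulus = p - 1  # indices are always reduced modulo p - 1
    # Product and power rules: I(c) + k*I(x) = I(b) (mod p - 1).
    target = (index[b % p] - index[c % p]) % modulus
    exponents = [e for e in range(1, modulus + 1) if (k * e - target) % modulus == 0]
    return sorted(number_with_index[e] for e in exponents)


print(solve_with_indices(3, 30, 4, 37, 2))   # [9, 12, 16, 21, 25, 28]
print(solve_with_indices(19, 1, 23, 37, 2))  # [9]
```

Both outputs match the worked examples: six solutions for $3x^{30} \equiv 4$ and the single solution x = 9 for $19x \equiv 23$, all modulo 37.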


The computational advantages of using indices are easily stated. The index 
rules convert multiplication into addition and exponentiation into multiplication. 
This undoubtedly sounds familiar, since it is exactly the same as the rules satisfied 
by logarithms: 


$$\log(ab) = \log(a) + \log(b) \quad\text{and}\quad \log(a^k) = k\log(a).$$


For this reason, the index is also known as the discrete logarithm. And just as 
logarithm tables were used to do computations in days of yore before the proliferation 
of inexpensive calculators, tables of indices were used for computations in number 
theory. Nowadays, with the availability of desktop computers, index tables are 
used less frequently for numerical computations, but indices retain their usefulness 
as a theoretical tool. 

Indeed, in the past few decades the theory of indices has enjoyed a renaissance 
due to its applicability to cryptography, although in the cryptographic community, 
indices are almost always called discrete logarithms. Suppose that you are given a 
large prime number p and two numbers a and g modulo p. The Discrete Logarithm 
Problem (DLP) is the problem of finding the exponent k such that 


$$g^k \equiv a \pmod{p}.$$

In other words, the discrete logarithm problem asks you to find the index of a 
modulo p for the base g. As we saw in Chapter 16, it is relatively easy to compute 
$g^k$ (mod p) if you know g and k. However, if p is large, it is quite difficult to 
find the value of k if you’re given the value of $g^k$ (mod p). This dichotomy can 
be used to construct public key cryptosystems, much in the way that we used the 
difficulty of factoring numbers to construct the RSA cryptosystem in Chapter 18. 
Exercise 29.6 contains a description of a discrete logarithm-based cryptosystem 
called the ElGamal cryptosystem. 
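
The two directions of this dichotomy can be seen side by side in a small Python sketch. It is only an illustration with a tiny prime: the function name `discrete_log` is an assumption, and the search shown is the naive one that tries every exponent in turn, which is exactly what becomes impractical when p is large. Python's built-in three-argument `pow` performs the fast modular exponentiation.

```python
def discrete_log(a, g, p):
    """Return the smallest k >= 1 with g^k congruent to a modulo p, or None."""
    power = 1
    for k in range(1, p):
        power = (power * g) % p  # one multiplication per candidate exponent
        if power == a % p:
            return k
    return None


p, g, k = 37, 2, 29
a = pow(g, k, p)               # easy direction: successive squaring
print(a)                       # 24
print(discrete_log(a, g, p))   # 29, found only after trying 29 exponents
```

The forward computation takes a number of steps proportional to the number of digits of k, while this naive search can take up to p − 1 steps.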


Exercises 

29.1. Use the table of indices modulo 37 to find all solutions to the following congruences. 
(a) $12x \equiv 23 \pmod{37}$   (c) $x^{12} \equiv 11 \pmod{37}$ 
(b) $5x^{23} \equiv 18 \pmod{37}$   (d) $7x^{20} \equiv 34 \pmod{37}$ 


29.2. (a) Create a table of indices modulo 17 using the primitive root 3. 
(b) Use your table to solve the congruence $4x \equiv 11 \pmod{17}$. 
(c) Use your table to find all solutions to the congruence $5x^6 \equiv 7 \pmod{17}$. 


29.3. (a) If a and b satisfy the relation $ab \equiv 1 \pmod{p}$, how are the indices $I(a)$ and $I(b)$ 
related to one another? 

(b) If a and b satisfy the relation $a + b \equiv 0 \pmod{p}$, how are the indices $I(a)$ and $I(b)$ 
related to one another? 
(c) If a and b satisfy the relation $a + b \equiv 1 \pmod{p}$, how are the indices $I(a)$ and $I(b)$ 
related to one another? 


29.4. (a) If k divides p − 1, show that the congruence $x^k \equiv 1 \pmod{p}$ has exactly k 
distinct solutions modulo p. 
(b) More generally, consider the congruence 


$$x^k \equiv a \pmod{p}.$$


Find a simple way to use the values of k, p, and the index $I(a)$ to determine how 
many solutions this congruence has. 

(c) The number 3 is a primitive root modulo the prime 1987. How many solutions are 
there to the congruence $x^{111} \equiv 729 \pmod{1987}$? [Hint. $729 = 3^6$.] 


29.5. Write a program that takes as input a prime p, a primitive root g for p, and a 
number a, and produces as output the index $I(a)$. Use your program to make a table of 
indices for the prime p = 47 and the primitive root g = 5. 


29.6. In this exercise we describe a public key cryptosystem called the ElGamal 
Cryptosystem that is based on the difficulty of solving the discrete logarithm problem. Let p be 
a large prime number and let g be a primitive root modulo p. Here’s how Alice creates a 
key and Bob sends Alice a message. 

The first step is for Alice to choose a number k to be her secret key. She computes the 
number $a \equiv g^k \pmod{p}$. She publishes this number a, which is the public key that Bob 
(or anyone else) will use to send her messages. 

Now suppose that Bob wants to send Alice the message m, where m is a number 
between 2 and p − 1. He randomly chooses a number r and computes the two numbers 


$$e_1 \equiv g^r \pmod{p} \quad\text{and}\quad e_2 \equiv m a^r \pmod{p}.$$


Bob sends Alice the pair of numbers $(e_1, e_2)$. 

Finally, Alice needs to decrypt the message. She first uses her secret key k to compute 
$c \equiv e_1^k \pmod{p}$. Next she computes $u \equiv c^{-1} \pmod{p}$. [That is, she solves 
$cu \equiv 1 \pmod{p}$ for u, using the method in Chapter 8.] Finally, she computes $v \equiv u e_2 \pmod{p}$. 


We can summarize Alice’s computation by the formula 
$$v \equiv e_2 \cdot \left(e_1^k\right)^{-1} \pmod{p}.$$


(a) Show that when Alice finishes her computation the number v that she computes 
equals Bob’s message m. 

(b) Show that if someone knows how to solve the discrete logarithm problem for the 
prime p and base g then he or she can read Bob’s message. 
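
The steps of the ElGamal system can be traced on a toy example. The Python sketch below is an illustration under stated assumptions: the function names and the tiny parameters (p = 37, g = 2, secret key k = 5, message m = 20, random number r = 7) are chosen only for the example and are far too small for real use. It also swaps in Fermat's Little Theorem to compute the inverse u as $c^{p-2}$ modulo p, instead of solving $cu \equiv 1$ by the Chapter 8 method. A numerical round trip is a sanity check, not a proof of part (a).

```python
def elgamal_encrypt(m, a, g, p, r):
    """Bob's step: return (e1, e2) for message m, public key a, random number r."""
    e1 = pow(g, r, p)
    e2 = (m * pow(a, r, p)) % p
    return e1, e2


def elgamal_decrypt(e1, e2, k, p):
    """Alice's step: recover the message using her secret key k."""
    c = pow(e1, k, p)
    u = pow(c, p - 2, p)  # inverse of c modulo p, via Fermat's Little Theorem
    return (u * e2) % p


p, g = 37, 2             # toy prime and primitive root (assumed for illustration)
k = 5                    # Alice's secret key
a = pow(g, k, p)         # Alice's public key
e1, e2 = elgamal_encrypt(20, a, g, p, r=7)
print(elgamal_decrypt(e1, e2, k, p))  # 20
```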


29.7. For this exercise, use the ElGamal cryptosystem described in Exercise 29.6. 
(a) Bob wants to use Alice’s public key a = 22695 for the prime p = 163841 and base 
g = 3 to send her the message m = 39828. He chooses to use the random number 
r = 129381. Compute the encrypted message $(e_1, e_2)$ he should send to Alice. 
(b) Suppose that Bob sends the same message to Alice, but he chooses a different value 
for r. Will the encrypted message be the same? 

(c) Alice has chosen the secret key k = 278374 for the prime p = 380803 and the base 
g = 2. She receives a message (consisting of three message blocks) 


(61745, 206881), (255836, 314674), (108147, 350768) 


from Bob. Decrypt the message and convert it to letters using the number-to-letter 
conversion table in Chapter 18. 


Chapter 30 


The Equation $X^4 + Y^4 = Z^4$ 


Fermat’s Last Theorem says that if $n \ge 3$, then the equation 


$$x^n + y^n = z^n$$

has no solutions in positive integers x, y, and z. Fermat scribbled this statement 
in the margin of his copy of Diophantus’ Arithmetica sometime during the middle 
of the seventeenth century, but it wasn’t until the end of the twentieth century that 
Andrew Wiles gave the first definitive proof. In this chapter we describe Fermat’s 
proof for the particular exponent n = 4. In fact, we prove the following stronger 
statement. 


Theorem 30.1 (Fermat’s Last Theorem for Exponent 4). The equation 


$$x^4 + y^4 = z^2$$


has no solutions in positive integers x, y, and z. 


Proof. We use Fermat’s method of descent to prove this theorem. Recall that the 
idea of “descent,” as used in Chapter 24 to write a prime as a sum of two squares, 
is to descend from a large solution to a small solution. How does that help us in 
this instance, since we’re trying to show that there aren’t any solutions at all? 

What we do is to suppose that there is a solution (x, y, z) in positive integers, 
and we use this supposed solution to produce a new solution (X, Y, Z) in positive 
integers with Z < z. Repeating this process, we would end up with a never-ending 
list of solutions 


$$(x_1, y_1, z_1),\ (x_2, y_2, z_2),\ (x_3, y_3, z_3),\ \ldots \quad\text{with}\quad z_1 > z_2 > z_3 > \cdots.$$


This is, of course, completely absurd, since a decreasing list of positive integers 
can’t continue indefinitely. The only escape from this absurdity lies in our original 
assumption that there is a solution. In other words, this contradiction shows that 
no solutions exist. 

Now for the nitty-gritty details. We assume that we are given a solution (x, y, z) 
to the equation 

$$x^4 + y^4 = z^2,$$

and we want to find a new smaller solution. If x, y, and z have a common factor, 
then we can factor it out and cancel it, so we may as well assume that they are 
relatively prime. Next, we observe that if we let $a = x^2$, $b = y^2$, and $c = z$, 
then (a, b, c) is a primitive Pythagorean triple, 


$$a^2 + b^2 = c^2.$$


We know from Chapter 2 what all primitive Pythagorean triples look like. Possibly 
after switching x and y, there are odd integers s and t such that 

$$x^2 = a = st, \qquad y^2 = b = \frac{s^2 - t^2}{2}, \qquad z = c = \frac{s^2 + t^2}{2}.$$


Notice that the product st is odd and equal to a square and that the only squares 
modulo 4 are 0 and 1, so we must have 


$$st \equiv 1 \pmod{4}.$$


This means that s and t are either both 1 modulo 4 or both 3 modulo 4. In any case, 
we see that 
$$s \equiv t \pmod{4}.$$


Next we look at the equation 
$$2y^2 = s^2 - t^2 = (s - t)(s + t).$$


The fact that s and t are odd and relatively prime means that the only common factor 
of s − t and s + t is 2. We also know that s − t is divisible by 4, so s + t must 
be twice an odd number. Furthermore, we know that the product (s − t)(s + t) is 
twice a square. The only way this can happen is if we have 


$$s + t = 2u^2 \quad\text{and}\quad s - t = 4v^2$$


for some integers u and v with u and 2v relatively prime. 
We solve for s and t in terms of u and v, 


$$s = u^2 + 2v^2 \quad\text{and}\quad t = u^2 - 2v^2,$$
and substitute into the formula $x^2 = st$ to get 
$$x^2 = u^4 - 4v^4.$$


This can be rearranged to read 


$$x^2 + 4v^4 = u^4.$$


Unfortunately, this isn’t quite the equation we’re looking for, so we repeat the 
process. If we let $A = x$, $B = 2v^2$, and $C = u^2$, then 
$$A^2 + B^2 = C^2,$$
so (A, B, C) is a primitive Pythagorean triple. Again referring to Chapter 2, we 
can find odd relatively prime integers S and T so that 

$$x = A = ST, \qquad 2v^2 = B = \frac{S^2 - T^2}{2}, \qquad u^2 = C = \frac{S^2 + T^2}{2}.$$

The middle formula says that 
$$4v^2 = S^2 - T^2 = (S - T)(S + T).$$


Now S and T are odd and relatively prime, so the greatest common divisor of 
S − T and S + T is 2. Furthermore, their product is a square, so it must be true 
that 

$$S + T = 2X^2 \quad\text{and}\quad S - T = 2Y^2$$


for some numbers X and Y. Solving for S and T in terms of X and Y gives 
$$S = X^2 + Y^2 \quad\text{and}\quad T = X^2 - Y^2,$$
and then substituting into the formula for $u^2$ yields 


$$u^2 = \frac{S^2 + T^2}{2} = \frac{(X^2 + Y^2)^2 + (X^2 - Y^2)^2}{2} = X^4 + Y^4.$$


Voilà! We have a new solution (X, Y, u) to our original equation 


$$x^4 + y^4 = z^2.$$


It only remains to verify that the new solution is smaller than the original one. 
Using various formulas from above, we find that 


$$z = \frac{s^2 + t^2}{2} = \frac{(u^2 + 2v^2)^2 + (u^2 - 2v^2)^2}{2} = u^4 + 4v^4.$$


This makes it clear that u is smaller than z. □ 
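
A computer search cannot replace the descent argument, but it is a reassuring sanity check on the statement of Theorem 30.1. The following Python sketch, added here as an illustration, looks for solutions with x and y up to an arbitrarily chosen bound; the function name and the bound 200 are assumptions made for the example, and `math.isqrt` requires Python 3.8 or later.

```python
from math import isqrt


def fourth_power_square_solutions(bound):
    """List all (x, y, z) with 1 <= x <= y <= bound and x^4 + y^4 = z^2."""
    found = []
    for x in range(1, bound + 1):
        for y in range(x, bound + 1):
            total = x**4 + y**4
            z = isqrt(total)      # integer square root, exact for big integers
            if z * z == total:    # total is a perfect square
                found.append((x, y, z))
    return found


print(fourth_power_square_solutions(200))  # [] as Theorem 30.1 predicts
```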
Exercises 


30.1. Show that the equation $y^2 = x^3 + xz^4$ has no solutions in nonzero integers x, y, z. 
[Hint. Suppose that there is a solution. First show that it can be reduced to a solution 
satisfying gcd(x, z) = 1. Then use the fact that $x^3 + xz^4 = x(x^2 + z^4)$ is a perfect square 
to show that there are no solutions other than x = y = 0.] 


30.2. A Markoff triple is a triple of positive integers (x, y, z) that satisfies the Markoff 
equation 
$$x^2 + y^2 + z^2 = 3xyz.$$
There is one obvious Markoff triple, namely (1, 1, 1). 
(a) Find all Markoff triples that satisfy x = y. 
(b) Let $(x_0, y_0, z_0)$ be a Markoff triple. Show that the following are also Markoff triples: 


$$F(x_0, y_0, z_0) = (x_0, z_0, 3x_0z_0 - y_0),$$
$$G(x_0, y_0, z_0) = (y_0, z_0, 3y_0z_0 - x_0),$$
$$H(x_0, y_0, z_0) = (x_0, y_0, 3x_0y_0 - z_0).$$


This gives a way to create new Markoff triples from old ones. 

(c) Starting with the Markoff triple (1, 1, 1), repeatedly apply the functions F and G 
described in (b) to create at least eight more Markoff triples. Arrange them in a 
picture with two Markoff triples connected by a line segment if one is obtained from 
the other by using F or G. 


30.3. This exercise continues the study of the Markoff equation from Exercise 30.2. 
(a) It is clear from the form of the Markoff equation that if $(x_0, y_0, z_0)$ is a Markoff 
triple, then so are all of the triples obtained by permuting its coordinates. We say that 
a Markoff triple $(x_0, y_0, z_0)$ is normalized if its coordinates are arranged in increasing 
order of magnitude, 
$$x_0 \le y_0 \le z_0.$$


Prove that if $(x_0, y_0, z_0)$ is a normalized Markoff triple, then both $F(x_0, y_0, z_0)$ and 
$G(x_0, y_0, z_0)$ are normalized Markoff triples. 
(b) The size of a Markoff triple $(x_0, y_0, z_0)$ is defined to be the sum of its coordinates, 


$$\operatorname{size}(x_0, y_0, z_0) = x_0 + y_0 + z_0.$$
Prove that if $(x_0, y_0, z_0)$ is a normalized Markoff triple, then 


$$\operatorname{size}(x_0, y_0, z_0) < \operatorname{size} F(x_0, y_0, z_0),$$
$$\operatorname{size}(x_0, y_0, z_0) < \operatorname{size} G(x_0, y_0, z_0),$$
$$\operatorname{size}(x_0, y_0, z_0) > \operatorname{size} H(x_0, y_0, z_0).$$
[Hint. For the inequality for H, use the quadratic formula to solve the Markoff equation 
for $z_0$ in terms of $x_0$ and $y_0$. Show that the assumption $x_0 \le y_0 \le z_0$ forces us 
to take the plus sign in the quadratic formula.] 




(c) Prove that every Markoff triple can be obtained by starting with the Markoff triple 
(1, 1, 1) and repeatedly applying the functions F and G. [Hint. If $(x_0, y_0, z_0)$ is a 
normalized Markoff triple not equal to (1, 1, 1) or (1, 1, 2), apply the map H and 
rearrange the coordinates to get a normalized Markoff triple $(x_1, y_1, z_1)$ of strictly 
smaller size such that one of $F(x_1, y_1, z_1)$ or $G(x_1, y_1, z_1)$ is equal to $(x_0, y_0, z_0)$.] 


Chapter 31 


Square–Triangular Numbers Revisited 


Some numbers are “shapely” in that they can be laid out in some sort of regular 
shape. For example, a square number $n^2$ can be arranged in the shape of an 
n-by-n square. Similarly, a triangular number is a number that can be arranged in 
the shape of a triangle. The following picture illustrates the first few triangular and 
square numbers (other than 1). 

[Figure: Triangular Numbers. Triangles of dots illustrating 1 + 2 = 3, 1 + 2 + 3 = 6, and 1 + 2 + 3 + 4 = 10.]

[Figure: Square Numbers. Squares of dots illustrating $2^2 = 4$, $3^2 = 9$, and $4^2 = 16$.]


Triangular numbers are thus formed by adding 

$$1 + 2 + 3 + \cdots + m$$

for different values of m. We found a formula for the m-th triangular number in 
Chapter 1, 

$$1 + 2 + 3 + \cdots + m = \frac{m(m+1)}{2}.$$

Here’s a list of the first few triangular and square numbers. 


Triangular Numbers: 1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105 
Square Numbers: 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169 


In Chapter 1 we posed the question of “Squaring the Triangle,” that is, finding 
square numbers that are also triangular numbers. Even our short list reveals two 
examples, 1 (which isn’t very interesting) and 36. This means that 36 pebbles 
can be arranged in the shape of a 6-by-6 square, and they can also be arranged in 
the shape of a triangle with 8 rows. An exercise in Chapter 1 asked you to find 
one or two more examples of these square–triangular numbers and to think about 
the question of how many there are. Using the mathematical sophistication we’ve 
gained in the subsequent chapters, we’re now going to develop a method for finding 
all square–triangular numbers. 

Triangular numbers look like $m(m+1)/2$ and square numbers look like $n^2$, 
so square-triangular numbers are solutions to the equation 
$$n^2 = \frac{m(m+1)}{2}$$
with positive integers $n$ and $m$. If we multiply both sides by 8, we can do a little 
algebra to get 
$$8n^2 = 4m^2 + 4m = (2m+1)^2 - 1.$$
This suggests that we make the substitution 
$$x = 2m + 1 \quad\text{and}\quad y = 2n$$
to get the equation 
$$2y^2 = x^2 - 1,$$
which we rearrange into the form 
$$x^2 - 2y^2 = 1.$$
Solutions to this equation give square-triangular numbers with 
$$m = \frac{x-1}{2} \quad\text{and}\quad n = \frac{y}{2}.$$
By trial and error we notice one solution, $(x, y) = (3, 2)$, which gives the 
square-triangular number $(m, n) = (1, 1)$. With a little more experimentation 
(or using the fact that 36 is square-triangular), we find another solution $(x, y) = 
(17, 12)$ corresponding to $(m, n) = (8, 6)$. Using a computer, we can search 
for more solutions by substituting $y = 1, 2, 3, \ldots$ and checking if $1 + 2y^2$ is a 
square. The next solution found is $(x, y) = (99, 70)$, which gives us a new 
square-triangular number with $(m, n) = (49, 35)$. In other words, 1225 is a 
square-triangular number, since 
$$35^2 = 1225 = 1 + 2 + 3 + \cdots + 48 + 49.$$
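
The computer search just described can be sketched in a few lines of Python. This is only an illustration; the function names `search_pell_2` and `to_square_triangular` are our own inventions, and the search bound of 100 is an arbitrary choice.

```python
from math import isqrt

def search_pell_2(y_max):
    """Return all (x, y) with 1 <= y <= y_max and x^2 - 2y^2 = 1."""
    solutions = []
    for y in range(1, y_max + 1):
        target = 1 + 2 * y * y
        x = isqrt(target)        # exact integer square root (rounded down)
        if x * x == target:      # 1 + 2y^2 is a perfect square
            solutions.append((x, y))
    return solutions

def to_square_triangular(x, y):
    """Convert a solution (x, y) into (m, n) using m = (x-1)/2 and n = y/2."""
    return (x - 1) // 2, y // 2

for x, y in search_pell_2(100):
    m, n = to_square_triangular(x, y)
    print((x, y), (m, n), n * n)
```

The search uses integer arithmetic throughout, so there is no danger of a rounding error making a non-square look like a square.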
What tools can we use to solve the equation 
$$x^2 - 2y^2 = 1?$$
One method we’ve used repeatedly in the past is factorization. Unfortunately, 
$x^2 - 2y^2$ does not factor if we stay within the realm of whole numbers; but if 
we expand our horizons a little, it does factor as 
$$x^2 - 2y^2 = (x + y\sqrt{2})(x - y\sqrt{2}).$$
For example, our solution $(x, y) = (3, 2)$ can be written as 
$$1 = 3^2 - 2 \cdot 2^2 = (3 + 2\sqrt{2})(3 - 2\sqrt{2}).$$


Now see what happens if we square the left- and right-hand sides of this equation. 
$$1 = 1^2 = (3 + 2\sqrt{2})^2 (3 - 2\sqrt{2})^2$$
$$= (17 + 12\sqrt{2})(17 - 12\sqrt{2})$$
$$= 17^2 - 2 \cdot 12^2.$$
So by “squaring” the solution $(x, y) = (3, 2)$, we have constructed the next solution 
$(x, y) = (17, 12)$. 


This process can be repeated to find more solutions. Thus, cubing the $(x, y) = 
(3, 2)$ solution gives 
$$1 = 1^3 = (3 + 2\sqrt{2})^3 (3 - 2\sqrt{2})^3$$
$$= (99 + 70\sqrt{2})(99 - 70\sqrt{2})$$
$$= 99^2 - 2 \cdot 70^2,$$
and taking the fourth power gives 
$$1 = 1^4 = (3 + 2\sqrt{2})^4 (3 - 2\sqrt{2})^4$$
$$= (577 + 408\sqrt{2})(577 - 408\sqrt{2})$$
$$= 577^2 - 2 \cdot 408^2.$$


Notice that the fourth power gives us a new square-triangular number, $(m, n) = 
(288, 204)$. When doing computations of this sort, it’s not necessary to raise the 
original solution to a large power. Instead, we can just multiply the original solution 
by the current one to get the next one. Thus, to find the fifth-power solution, we 
multiply the original solution $3 + 2\sqrt{2}$ by the fourth-power solution $577 + 408\sqrt{2}$. 
This gives 
$$(3 + 2\sqrt{2})(577 + 408\sqrt{2}) = 3363 + 2378\sqrt{2},$$
and from this we read off the fifth-power solution $(x, y) = (3363, 2378)$. Continuing 
in this fashion, we can construct a list of square-triangular numbers. 


      x        y        m        n    n^2 = m(m+1)/2 
      3        2        1        1                 1 
     17       12        8        6                36 
     99       70       49       35              1225 
    577      408      288      204             41616 
   3363     2378     1681     1189           1413721 
  19601    13860     9800     6930          48024900 
 114243    80782    57121    40391        1631432881 
 665857   470832   332928   235416       55420693056 
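
The table can be regenerated by the "multiply by the original solution" step described above. The following Python sketch is one way to do it; the function name `square_triangular_table` is ours, and the update rule is simply the product $(3 + 2\sqrt{2})(x + y\sqrt{2}) = (3x + 4y) + (2x + 3y)\sqrt{2}$ multiplied out.

```python
def square_triangular_table(rows):
    """Return rows (x, y, m, n, n^2) built from powers of 3 + 2*sqrt(2)."""
    x, y = 3, 2                      # the original solution
    table = []
    for _ in range(rows):
        m, n = (x - 1) // 2, y // 2  # m = (x-1)/2 and n = y/2
        table.append((x, y, m, n, n * n))
        # Multiply the current solution by 3 + 2*sqrt(2) to get the next one.
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return table

for row in square_triangular_table(8):
    print(row)
```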


As you see, these square-triangular numbers get quite large. 
By raising $3 + 2\sqrt{2}$ to higher and higher powers, we can find more and more 
solutions to the equation 
$$x^2 - 2y^2 = 1,$$
which gives us an inexhaustible supply of square-triangular numbers. Thus, there 
are infinitely many square-triangular numbers, which answers our original question, 
but now we ask if this procedure actually produces all of them. The answer is 
that it does, and you won’t be surprised to learn that we use a descent argument to 
verify this fact. 

Theorem 31.1 (Square-Triangular Number Theorem). (a) Every solution in positive 
integers to the equation 
$$x^2 - 2y^2 = 1$$
is obtained by raising $3 + 2\sqrt{2}$ to powers. That is, the solutions $(x_k, y_k)$ can all be 
found by multiplying out 
$$x_k + y_k\sqrt{2} = (3 + 2\sqrt{2})^k \quad\text{for } k = 1, 2, 3, \ldots.$$
(b) Every square-triangular number $n^2 = \frac{1}{2}m(m+1)$ is given by 
$$m = \frac{x_k - 1}{2}, \quad n = \frac{y_k}{2} \quad\text{for } k = 1, 2, 3, \ldots,$$
where the $(x_k, y_k)$’s are the solutions from (a). 


Proof. The only thing we have left to check is that if $(u, v)$ is any solution to 
$x^2 - 2y^2 = 1$, then it comes from a power of the solution (3, 2). In other words, 
we must show that 
$$u + v\sqrt{2} = (3 + 2\sqrt{2})^k \quad\text{for some } k.$$
We prove this by the method of descent. Here’s the plan. If $u = 3$, then we must 
have $v = 2$, so there’s really nothing to check. So we suppose that $u > 3$, and we 
show that there is then another solution $(s, t)$ in positive integers such that 
$$u + v\sqrt{2} = (3 + 2\sqrt{2})(s + t\sqrt{2}) \quad\text{and}\quad s < u.$$
Why does this help? Well, if $(s, t) = (3, 2)$, then we’re done; otherwise, $s$ 
must be larger than 3, so we can do the same thing starting from $(s, t)$ to find a new 
solution $(q, r)$ with 
$$s + t\sqrt{2} = (3 + 2\sqrt{2})(q + r\sqrt{2}) \quad\text{and}\quad q < s.$$
This means that 
$$u + v\sqrt{2} = (3 + 2\sqrt{2})^2 (q + r\sqrt{2}).$$
Now if $(q, r) = (3, 2)$, we’re done, and if not, then we apply the procedure yet 
again. Continuing in this fashion, we observe that this process cannot go on forever, 
since each time we get a new solution, the value of $x$ is smaller. But these values 
are all positive integers, so they cannot keep getting smaller indefinitely. Therefore, 
eventually we get (3, 2) as a solution, which means that eventually we end up with 
$u + v\sqrt{2}$ written as a power of $3 + 2\sqrt{2}$. 

So now we begin with a solution $(u, v)$ with $u > 3$, and we are looking for a 
solution $(s, t)$ with the property 
$$u + v\sqrt{2} = (3 + 2\sqrt{2})(s + t\sqrt{2}) \quad\text{and}\quad s < u.$$

Multiplying out the right-hand side of the equation, we need to solve 
$$u + v\sqrt{2} = (3s + 4t) + (2s + 3t)\sqrt{2}$$
for $s$ and $t$. In other words, we need to solve 
$$u = 3s + 4t \quad\text{and}\quad v = 2s + 3t.$$
This is done easily, the answer being 
$$s = 3u - 4v \quad\text{and}\quad t = -2u + 3v.$$
Let’s check that this $(s, t)$ really gives a solution. 
$$s^2 - 2t^2 = (3u - 4v)^2 - 2(-2u + 3v)^2$$
$$= (9u^2 - 24uv + 16v^2) - 2(4u^2 - 12uv + 9v^2)$$
$$= u^2 - 2v^2$$
$$= 1,$$
since we know that $(u, v)$ is a solution. So that’s fine. There are two more things 
we need to check. First, we need to check that $s$ and $t$ are both positive. Second, 
we must verify that $s < u$, since we want the new solution to be “smaller” than the 
original solution. 
It’s easy to see that $s$ is positive using the fact that 
$$u^2 = 1 + 2v^2 > 2v^2, \quad\text{which tells us that } u > \sqrt{2}\,v.$$
Then 
$$s = 3u - 4v > 3\sqrt{2}\,v - 4v = (3\sqrt{2} - 4)v > 0,$$


since $3\sqrt{2} \approx 4.242$ is greater than 4. 
The proof that $t$ is positive is a little trickier. Here’s one way to do it: 

$u > 3$                       We assumed this. 
$u^2 > 9$                     Square both sides. 
$9u^2 > 9 + 8u^2$             Add $8u^2$ to both sides. 
$9u^2 - 9 > 8u^2$             Move the 9 to the other side. 
$u^2 - 1 > \frac{8}{9}u^2$    Divide both sides by 9. 
$2v^2 > \frac{8}{9}u^2$       Since we know that $u^2 - 2v^2 = 1$. 
$v > \frac{2}{3}u$            Divide by 2 and take square roots. 

Using this last inequality, it is now easy to check that $t$ is positive. 
$$t = -2u + 3v > -2u + 3 \cdot \frac{2}{3}u = 0.$$
We now know that $s$ and $t$ are positive, from which it follows that $s < u$, since 
$u = 3s + 4t$. This completes our proof that the descent process works and thus 
completes our proof of the Square-Triangular Number Theorem. □ 
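
To watch the descent in action, here is a small Python sketch that starts from any solution $(u, v)$ and repeatedly applies the substitution $s = 3u - 4v$, $t = -2u + 3v$ from the proof until it reaches (3, 2). The function name `descend` is our own; the number of steps it counts is the exponent $k$ in $u + v\sqrt{2} = (3 + 2\sqrt{2})^k$.

```python
def descend(u, v):
    """Given positive integers with u^2 - 2v^2 = 1, return the exponent k
    for which u + v*sqrt(2) equals (3 + 2*sqrt(2))^k."""
    if u <= 0 or v <= 0 or u * u - 2 * v * v != 1:
        raise ValueError("(u, v) must be a solution in positive integers")
    k = 1
    while (u, v) != (3, 2):
        # One descent step: divide u + v*sqrt(2) by 3 + 2*sqrt(2).
        u, v = 3 * u - 4 * v, -2 * u + 3 * v
        k += 1
    return k

print(descend(17, 12))      # 17 + 12*sqrt(2) is the square of 3 + 2*sqrt(2)
print(descend(3363, 2378))  # the fifth-power solution
```

The loop must stop, for exactly the reason given in the proof: each step produces a solution in positive integers with a strictly smaller first coordinate.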


The Square-Triangular Number Theorem says that every solution $(x_k, y_k)$ in 
positive integers to the equation 
$$x^2 - 2y^2 = 1$$
can be obtained by multiplying out 
$$x_k + y_k\sqrt{2} = (3 + 2\sqrt{2})^k \quad\text{for } k = 1, 2, 3, \ldots.$$
The table at the beginning of this chapter makes it clear that the size of the solutions 
grows very rapidly as $k$ increases. We’d like to get a more precise idea of just how 
large the $k$th solution is. To do this, we note that the preceding formula is still 
correct if we replace $\sqrt{2}$ by $-\sqrt{2}$. In other words, it’s also true that 
$$x_k - y_k\sqrt{2} = (3 - 2\sqrt{2})^k \quad\text{for } k = 1, 2, 3, \ldots.$$
Now if we add these two formulas together and divide by 2, we obtain a formula 
for $x_k$: 
$$x_k = \frac{(3 + 2\sqrt{2})^k + (3 - 2\sqrt{2})^k}{2}.$$
Similarly, if we subtract the second formula from the first and divide by $2\sqrt{2}$, we 
get a formula for $y_k$: 
$$y_k = \frac{(3 + 2\sqrt{2})^k - (3 - 2\sqrt{2})^k}{2\sqrt{2}}.$$
These formulas for $x_k$ and $y_k$ are useful because 
$$3 + 2\sqrt{2} \approx 5.82843 \quad\text{and}\quad 3 - 2\sqrt{2} \approx 0.17157.$$
The fact that $3 - 2\sqrt{2}$ is less than 1 means that when we take a large power of 
$3 - 2\sqrt{2}$, we’ll get a very tiny number. For example, 
$$(3 - 2\sqrt{2})^{10} \approx 0.0000000221,$$
so 
$$\frac{(3 + 2\sqrt{2})^{10}}{2} \approx 22619536.99999998895 \quad\text{and}\quad \frac{(3 + 2\sqrt{2})^{10}}{2\sqrt{2}} \approx 15994428.000000007815.$$
But we know that $x_{10}$ and $y_{10}$ are integers, so the 10th solution is 
$$(x_{10}, y_{10}) = (22619537, 15994428).$$
Using this we find that the 10th square-triangular number $n^2 = m(m+1)/2$ is 
given by 
$$n = 7997214 \quad\text{and}\quad m = 11309768.$$
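
The rounding trick is easy to try on a computer. The Python sketch below (with names `ALPHA` and `kth_solution` of our own choosing) drops the tiny $(3 - 2\sqrt{2})^k$ term and rounds to the nearest integer. It relies on ordinary floating-point arithmetic, so it is only trustworthy for moderate values of $k$; the final check raises an error if the rounded pair fails to satisfy the equation.

```python
from math import sqrt

ALPHA = 3 + 2 * sqrt(2)   # approximately 5.82843

def kth_solution(k):
    """Approximate the k-th solution of x^2 - 2y^2 = 1 by rounding."""
    x = round(ALPHA ** k / 2)                # the discarded term is below 1/2
    y = round(ALPHA ** k / (2 * sqrt(2)))
    if x * x - 2 * y * y != 1:               # exact integer check
        raise ArithmeticError("floating-point precision exhausted")
    return x, y

x10, y10 = kth_solution(10)
print(x10, y10)                   # the 10th solution
print((x10 - 1) // 2, y10 // 2)   # the corresponding m and n
```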
It’s also apparent from the formulas for $x_k$ and $y_k$ why the solutions grow so 
rapidly, since 
$$x_k \approx \frac{1}{2}(5.82843)^k \quad\text{and}\quad y_k \approx \frac{1}{2\sqrt{2}}(5.82843)^k.$$
Thus, each successive solution is more than five times as large as the previous 
one. Mathematically, we say that the size of the solutions grows exponentially. 
Later, when we study elliptic curves in Chapter 41, we’ll see some equations whose 
solutions grow even faster than this! 




Exercises 


31.1. Find four solutions in positive integers to the equation 
$$x^2 - 5y^2 = 1.$$
[Hint. Use trial and error to find a small solution $(a, b)$ and then take powers of $a + b\sqrt{5}$.] 
31.2. (a) In Chapters 24 and 25 we studied which numbers can be written as sums of two 
squares. Compile some data and try to make a conjecture as to which numbers can 
be written as sums of (one or) two triangular numbers. For example, 7 = 1 + 6 and 
25 = 10 + 15 are sums of two triangular numbers, while 19 is not. 


(b) Prove that your conjecture in (a) is correct. 
(c) Which numbers can be written as sums of one, two, or three triangular numbers? 


31.3. (a) Let $(x_k, y_k)$ for $k = 0, 1, 2, 3, \ldots$ be the solutions to $x^2 - 2y^2 = 1$ described 
in Theorem 31.1. Fill in the blanks with positive numbers such that the following 
formulas are true. Then prove that the formulas are correct. 
$$x_{k+1} = \_\_\_\,x_k + \_\_\_\,y_k \quad\text{and}\quad y_{k+1} = \_\_\_\,x_k + \_\_\_\,y_k.$$
(b) Fill in the blanks with positive numbers such that the following statement is true: 
If $(m, n)$ gives a square-triangular number, that is, if the pair $(m, n)$ satisfies the 
formula $n^2 = m(m+1)/2$, then 
$$(1 + \_\_\_\,m + \_\_\_\,n,\; 1 + \_\_\_\,m + \_\_\_\,n)$$
also gives a square-triangular number. 
(c) If $L$ is a square-triangular number, explain why $1 + 17L + 6\sqrt{L + 8L^2}$ is the next 
largest square-triangular number. 


31.4. A number $n$ is called a pentagonal number if $n$ pebbles can be arranged in the 
shape of a (filled in) pentagon. The first four pentagonal numbers are 1, 5, 12, and 22, as 
illustrated in Figure 31.1. You should visualize each pentagon as sitting inside the next 
larger pentagon. The $n$th pentagonal number is formed using an outer pentagon whose 
sides have $n$ pebbles. 

[Figure 31.1: The First Four Pentagonal Numbers: 1, 5, 12, 22] 

(a) Draw a picture for the fifth pentagonal number. 
(b) Figure out the pattern and find a simple formula for the $n$th pentagonal number. 
(c) What is the 10th pentagonal number? What is the 100th pentagonal number? 


Chapter 32 


Pell’s Equation 


In the last chapter we gave a complete description of the solutions to the equation 
$$x^2 - 2y^2 = 1 \quad\text{in positive integers } x \text{ and } y.$$
This is an example of what is called a Pell equation, which is an equation of the 
form 
$$x^2 - Dy^2 = 1,$$
where $D$ is a fixed positive integer that is not a perfect square. 

Pell’s equation has a long and fascinating history. Its first recorded appearance 
is in the “Cattle problem of Archimedes.” This problem involves eight different 
kinds of cattle and asks the reader to determine how many there are of each kind. 
Various linear relations are given, together with two nonlinear conditions, one spec- 
ifying that a certain quantity is a square and the other saying that a certain quantity 
is a triangular number. After a lot of algebra, the problem finally reduces to solving 
the Pell equation 

$$x^2 - 4729494y^2 = 1.$$


The y coordinate of the smallest solution, which was first determined by Amthor 
in 1880, has 41 digits, and then the answer to the original cattle problem has hun- 
dreds of thousands of digits! It seems unlikely that Archimedes or his contem- 
poraries could have determined the solution, but it is fascinating that they even 
thought to pose such a problem. 

Fast-forwarding through the centuries, the first significant progress in solv- 
ing Pell’s equation was made in India. As early as AD 628, Brahmagupta de- 
scribed how to use known solutions to Pell’s equation to create new solutions, 
and in AD 1150 Bhaskaracharya gave an ingenious method, with a surprisingly 
modern flavor, for finding an initial solution. Unfortunately, this groundbreaking 
work remained unknown in Europe until long after it had been rediscovered and 
superseded during the seventeenth century. 


Brahmagupta (598-670) One of India’s most famous mathematicians of his 
era, Brahmagupta’s best known work is the Brahmasphutasiddhanta (The 
Opening of the Universe) written in AD 628. This extraordinary book includes 
a discussion of equations of the form $x^2 - Dy^2 = A$, and in particular 
“Pell’s” equation $x^2 - Dy^2 = 1$. Brahmagupta describes a composition 
method for creating new solutions from old ones, which he calls samasa, and 
he gives an algorithm that (sometimes) produces an initial solution. 

Approximately 500 years later the Indian mathematician Bhaskaracharya 
(AD 1114-1185) extended Brahmagupta’s work on “Pell’s” equation by de- 
scribing a method that uses an initial approximate solution to find a true solu- 
tion via repeated reductions. Bhaskaracharya called his method chakravala; 
today arguments of this type go by the name “Fermat descent.” We saw ex- 
amples of Fermat descent in Chapters 24 and 30. Bhaskara illustrates his 
method by solving $x^2 - 61y^2 = 1$ more than 500 years before Fermat used 
this equation to issue a challenge. 


The modern European history of Pell’s equation begins in 1657 when Fermat 
challenged his fellow mathematicians to solve the equation $x^2 - 61y^2 = 1$. Several 
of them found the smallest solution, which is 


(x, y) = (1766319049, 226153980), 


and in 1657 William Brouncker described a general method for solving Pell’s equa- 
tion. Brouncker demonstrated the efficiency of his method by finding, in just a 
couple of hours, the smallest nontrivial solution 


(32188120829134849, 1819380158564160) 
to the equation 
$$x^2 - 313y^2 = 1.$$


J. Wallis described Brouncker’s method in a book on algebra and number theory, 
and Wallis and Fermat both asserted that Pell’s equation always has a solution. 
Euler mistakenly thought that the method in Wallis’s book was due to John Pell, 
another English mathematician, and it is Euler who assigned the equation the name 
by which it has since been known. Of such misapprehensions is mathematical 
immortality attained!¹ 


¹Some are born great, some achieve greatness, and some have mathematical greatness thrust 
upon them. With the benefit of historical hindsight, a better name for “Pell’s equation” might be the 
“$B^3$ equation,” in honor of the three mathematicians Brahmagupta, Bhaskaracharya, and Brouncker. 

Suppose that we are able to find a solution $(x_1, y_1)$ to the Pell equation 
$$x^2 - Dy^2 = 1.$$
Then we can produce new solutions using the same method described in the last 
chapter for $D = 2$. Factoring the known solution as 
$$1 = x_1^2 - Dy_1^2 = (x_1 + y_1\sqrt{D})(x_1 - y_1\sqrt{D}),$$
we square both sides to get a new solution 
$$1 = 1^2 = (x_1 + y_1\sqrt{D})^2 (x_1 - y_1\sqrt{D})^2$$
$$= \bigl((x_1^2 + y_1^2 D) + 2x_1y_1\sqrt{D}\bigr)\bigl((x_1^2 + y_1^2 D) - 2x_1y_1\sqrt{D}\bigr)$$
$$= (x_1^2 + y_1^2 D)^2 - (2x_1y_1)^2 D.$$
In other words, $(x_1^2 + y_1^2 D,\ 2x_1y_1)$ is a new solution. Taking the third power, 
fourth power, and so on, we can continue to find as many more additional solutions 
as we desire. 

This leaves two vexing questions. First, does every Pell equation have a solution? 
Note that this question didn’t arise when we studied the Pell equation 
$x^2 - 2y^2 = 1$, since for this specific equation it was easy to find the solution (3, 2). 
Second, assuming that a given Pell equation does have a solution, is it true that every 
solution can be found by taking powers of the smallest solution? For the equation 
$x^2 - 2y^2 = 1$ we showed that this is true; every solution comes from powers 
of $3 + 2\sqrt{2}$. The answers to both of these questions are given in the following 
theorem. 


Theorem 32.1 (Pell’s Equation Theorem). Let $D$ be a positive integer that is not 
a perfect square. Then Pell’s equation 
$$x^2 - Dy^2 = 1$$
always has solutions in positive integers. If $(x_1, y_1)$ is the solution with smallest $x_1$, 
then every solution $(x_k, y_k)$ can be obtained by taking powers 
$$x_k + y_k\sqrt{D} = (x_1 + y_1\sqrt{D})^k \quad\text{for } k = 1, 2, 3, \ldots.$$
For example, the smallest solution to the Pell equation 
$$x^2 - 47y^2 = 1$$
is $(x, y) = (48, 7)$. Then all solutions can be obtained by taking powers of 
$48 + 7\sqrt{47}$. The second and third smallest solutions are 
$$(48 + 7\sqrt{47})^2 = 4607 + 672\sqrt{47} \quad\text{and}$$
$$(48 + 7\sqrt{47})^3 = 442224 + 64505\sqrt{47}.$$
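
Taking powers is easy to mechanize. The following Python sketch (the function name `pell_powers` is ours) multiplies the current solution by $x_1 + y_1\sqrt{D}$ at each step, using the identity $(x_1 + y_1\sqrt{D})(x + y\sqrt{D}) = (x_1 x + D y_1 y) + (x_1 y + y_1 x)\sqrt{D}$. It assumes that the starting pair really is a solution and makes no attempt to find one.

```python
def pell_powers(D, x1, y1, count):
    """Return the first `count` solutions (x_k, y_k) of x^2 - D*y^2 = 1
    obtained as powers of the given solution x1 + y1*sqrt(D)."""
    if x1 * x1 - D * y1 * y1 != 1:
        raise ValueError("(x1, y1) is not a solution")
    solutions = []
    x, y = x1, y1
    for _ in range(count):
        solutions.append((x, y))
        # Multiply x + y*sqrt(D) by x1 + y1*sqrt(D).
        x, y = x1 * x + D * y1 * y, x1 * y + y1 * x
    return solutions

for x, y in pell_powers(47, 48, 7, 3):
    print(x, y, x * x - 47 * y * y)   # the last entry is always 1
```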


The second part of Pell’s Equation Theorem, which says that every solution to 
Pell’s equation is a power of the smallest solution, is actually not too difficult to 
verify. It can be proved for arbitrary values of D in much the same way that we 
proved it for D = 2 in the previous chapter. The first part, however, which asserts 
that there is always at least one solution, is somewhat more difficult. We postpone 
the proof of both parts until Chapter 34. 

Table 32.1 lists the smallest solution to Pell’s equation for all $D$ up to 75. As 
you can see, sometimes the smallest solution is quite small. For example, the 
equation $x^2 - 72y^2 = 1$ has the comparatively tiny solution (17, 2), as does the 
equation $x^2 - 75y^2 = 1$ with small solution (26, 3). On the other hand, sometimes 
the smallest solution is huge. Striking examples in the table include 
$$x^2 - 61y^2 = 1 \quad\text{with smallest solution } (1766319049, 226153980),$$
and 
$$x^2 - 73y^2 = 1 \quad\text{with smallest solution } (2281249, 267000).$$
Another example of this phenomenon is given by 
$$x^2 - 97y^2 = 1 \quad\text{with smallest solution } (62809633, 6377352),$$
and of course there’s the equation $x^2 - 313y^2 = 1$, already mentioned, which has 
a similarly spectacular smallest solution. 

There is no known pattern as to when the smallest solution is actually small 
and when it is large. It is known that the smallest solution $(x, y)$ to $x^2 - Dy^2 = 1$ 
is no larger than $x < 2^D$, but obviously this is not a very good estimate.² Maybe 
you’ll be able to discern a pattern that no one else has noticed and use it to prove 
hitherto unknown properties of the solutions to Pell’s equation. 


²There is a more precise bound for the smallest solution $(x, y)$ that is due to C. L. Siegel. He 
showed that for each $D$ there is a positive integer $h$ such that the number $h \cdot \log(x + y\sqrt{D})$ has the 
same order of magnitude as $\sqrt{D}$. In particular, $\log(x)$ and $\log(y)$ won’t be much larger than some 
multiple of $\sqrt{D}$. So, for $x$ and $y$ to be small, this mysterious number $h$ (which is called the class 
number for $D$) needs to be large. There are many unsolved problems concerning the class number, 
including the famous conjecture that there are infinitely many $D$’s whose class number equals 1. 

  D             x            y 
 51            50            7 
 52           649           90 
 53         66249         9100 
 54           485           66 
 55            89           12 
 56            15            2 
 57           151           20 
 58         19603         2574 
 59           530           69 
 60            31            4 
 61    1766319049    226153980 
 62            63            8 
 63             8            1 
 65           129           16 
 66            65            8 
 67         48842         5967 
 68            33            4 
 69          7775          936 
 70           251           30 
 71          3480          413 
 72            17            2 
 73       2281249       267000 
 74          3699          430 
 75            26            3 


Table 32.1: The Smallest Solution to the Pell Equation $x^2 - Dy^2 = 1$ 
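
For small entries, the table can be checked by the same brute force search that we used for $D = 2$: try $y = 1, 2, 3, \ldots$ and test whether $1 + Dy^2$ is a perfect square. The Python sketch below does exactly that. The function name `smallest_pell_solution` and the cutoff `y_limit` are our own choices, and this is certainly not an efficient method; for $D = 61$ the search would need more than two hundred million steps.

```python
from math import isqrt

def smallest_pell_solution(D, y_limit=10**6):
    """Search y = 1, 2, ..., y_limit for the smallest solution of
    x^2 - D*y^2 = 1. Return (x, y), or None if the search gives up."""
    if isqrt(D) ** 2 == D:
        raise ValueError("D must not be a perfect square")
    for y in range(1, y_limit + 1):
        target = 1 + D * y * y
        x = isqrt(target)
        if x * x == target:   # 1 + D*y^2 is a perfect square
            return x, y
    return None

for D in (72, 73, 75):
    print(D, smallest_pell_solution(D))
```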


Exercises 


32.1. A Pell equation is an equation $x^2 - Dy^2 = 1$, where $D$ is a positive integer that is 
not a perfect square. Can you figure out why we do not want $D$ to be a perfect square? 
Suppose that $D$ is a perfect square, say $D = A^2$. Can you describe the integer solutions of 
the equation $x^2 - A^2y^2 = 1$? 


32.2. Find a solution to the Pell equation x? — 22y? = 1 whose z is larger than 10°. 


32.3. Prove that every solution to the Pell equation $x^2 - 11y^2 = 1$ is obtained by taking 
powers of $10 + 3\sqrt{11}$. (Do not just quote the Pell Equation Theorem. I want you to 
give a proof for this equation using the same ideas that we used to handle the equation 
$x^2 - 2y^2 = 1$ in Chapter 31.) 

32.4. We continue our study of the pentagonal numbers described in Exercise 31.4. 
(a) Are there any pentagonal numbers (aside from 1) that are also triangular numbers? 
Are there infinitely many? 
(b) Are there any pentagonal numbers (aside from 1) that are also square numbers? Are 
there infinitely many? 
(c) Are there any numbers, aside from 1, that are simultaneously triangular, square, and 
pentagonal? Are there infinitely many? 


Chapter 33 


Diophantine Approximation 


How might we go about finding a solution to Pell’s equation 
$$x^2 - Dy^2 = 1$$
in positive integers $x$ and $y$? The factorization 
$$(x - y\sqrt{D})(x + y\sqrt{D}) = 1$$
expresses the number 1 as the product of two numbers, one of which is fairly large. 
More precisely, the number $x + y\sqrt{D}$ is large, especially if $x$ and $y$ are large, so 
the other factor 
$$x - y\sqrt{D} = \frac{1}{x + y\sqrt{D}}$$
must be rather small. 
We capitalize on this observation by investigating the following question: 

How small can we make $x - y\sqrt{D}$? 

If we can find integers $x$ and $y$ that make $x - y\sqrt{D}$ very small, we might hope 
that $x$ and $y$ give a solution to Pell’s equation.¹ For the remainder of this chapter 
we concentrate on giving Lejeune Dirichlet’s beautiful solution to this problem. 
We return to Pell’s equation in the next chapter. 


¹Unfortunately, as happens so often in life, our hopes are dashed when it turns out that $x$ and $y$ 
only give a solution to a “Pell-like” equation $x^2 - Dy^2 = M$. Don’t despair. Turning sorrow into 
joy, we will be able to take two carefully chosen solutions to $x^2 - Dy^2 = M$ and transform them 
miraculously into the sought after solution to $x^2 - Dy^2 = 1$. 

Let’s begin with the easiest answer to our question. For any positive integer $y$, 
if we take $x$ to be the integer closest to the number $y\sqrt{D}$, then the difference 
$$|x - y\sqrt{D}| \quad\text{is at most } \frac{1}{2}.$$
This is true because any real number lies between two integers, so its distance to 
the nearest integer is at most $\frac{1}{2}$. 

Can we do better? Here is a brief table for $\sqrt{13}$. For each integer $y$ from 1 
to 40, we have listed the integer $x$ that is closest to $y\sqrt{13}$, together with the values 
of $|x - y\sqrt{13}|$ and $x^2 - 13y^2$. 


   x    y   $|x - y\sqrt{13}|$   $x^2 - 13y^2$   |     x    y   $|x - y\sqrt{13}|$   $x^2 - 13y^2$ 
   4    1   0.394449      3   |    76   21   0.283423     43 
   7    2   0.211103     -3   |    79   22   0.322128    -51 
  11    3   0.183346      4   |    83   23   0.072321     12 
  14    4   0.422205    -12   |    87   24   0.466769     81 
  18    5   0.027756     -1   |    90   25   0.138782    -25 
  22    6   0.366692     16   |    94   26   0.255667     48 
  25    7   0.238859    -12   |    97   27   0.349884    -68 
  29    8   0.155590      9   |   101   28   0.044564      9 
  32    9   0.449961    -29   |   105   29   0.439013     92 
  36   10   0.055513     -4   |   108   30   0.166538    -36 
  40   11   0.338936     27   |   112   31   0.227910     51 
  43   12   0.266615    -23   |   115   32   0.377641    -87 
  47   13   0.127833     12   |   119   33   0.016808      4 
  50   14   0.477718    -48   |   123   34   0.411257    101 
  54   15   0.083269     -9   |   126   35   0.194295    -49 
  58   16   0.311180     36   |   130   36   0.200154     52 
  61   17   0.294372    -36   |   133   37   0.405397   -108 
  65   18   0.100077     13   |   137   38   0.010948     -3 
  69   19   0.494526     68   |   141   39   0.383500    108 
  72   20   0.111026    -16   |   144   40   0.222051    -64 

Notice that $|x - y\sqrt{13}|$ is always less than 1/2, just as we predicted. Sometimes 
it is close to 1/2, as happens for $y = 19$ and $y = 24$, but sometimes it is 
much smaller. For example, there are four instances in the table for which it is 
smaller than 0.05: 

$(x, y) = (18, 5)$,    $|x - y\sqrt{13}| = 0.027756$,  $x^2 - 13y^2 = -1$, 
$(x, y) = (101, 28)$,  $|x - y\sqrt{13}| = 0.044564$,  $x^2 - 13y^2 = 9$, 
$(x, y) = (119, 33)$,  $|x - y\sqrt{13}| = 0.016808$,  $x^2 - 13y^2 = 4$, 
$(x, y) = (137, 38)$,  $|x - y\sqrt{13}| = 0.010948$,  $x^2 - 13y^2 = -3$. 

If we extend the table up to $y = 200$, we find that all the following pairs $(x, y)$ 
satisfy $|x - y\sqrt{13}| < 0.05$: 


(18,5), (101, 28), (119, 33), (137, 38), (155, 43), (238, 66), (256, 71), 
(274, 76), (292, 81), (375, 104), (393, 109), (411, 114), (494, 137), 
(512, 142), (530, 147), (548, 152), (631, 175), (649, 180), (667, 185). 


Do you see a pattern? Well, I don’t either. 
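
If you would like to hunt for a pattern yourself, the search is easy to program. Here is a Python sketch; the name `close_pairs` and its parameters are our own, and it uses ordinary floating-point arithmetic, which is accurate enough for the small values of $y$ considered here.

```python
from math import sqrt

def close_pairs(D, y_max, tol):
    """List the pairs (x, y) with 1 <= y <= y_max, where x is the integer
    closest to y*sqrt(D) and |x - y*sqrt(D)| < tol."""
    root = sqrt(D)
    pairs = []
    for y in range(1, y_max + 1):
        x = round(y * root)               # integer closest to y*sqrt(D)
        if abs(x - y * root) < tol:
            pairs.append((x, y))
    return pairs

print(close_pairs(13, 40, 0.05))
print(close_pairs(13, 200, 0.05))
```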
Since there doesn’t seem to be any obvious pattern, we take a different approach 
to making $|x - y\sqrt{13}|$ small. The method that we use is called 


The Pigeonhole Principle 


This marvelous principle says that if you have more pigeons than pigeonholes, then 
at least one of the pigeonholes contains more than one pigeon!² Although seemingly 
obvious and trivial, the proper application of this principle yields a bountiful 
mathematical harvest. 

What we are going to do is look for two different multiples $y_1\sqrt{D}$ and $y_2\sqrt{D}$ 
whose difference is very close to a whole number. To do this, we pick some large 
number $Y$ and consider all the multiples 
$$0\sqrt{D},\ 1\sqrt{D},\ 2\sqrt{D},\ 3\sqrt{D},\ \ldots,\ Y\sqrt{D}.$$
We write each of these multiples as the sum of a whole number and a decimal 
between 0 and 1, 

$0\sqrt{D} = N_0 + F_0$  with $N_0 = 0$ and $F_0 = 0$. 
$1\sqrt{D} = N_1 + F_1$  with $N_1$ an integer and $0 \le F_1 < 1$. 
$2\sqrt{D} = N_2 + F_2$  with $N_2$ an integer and $0 \le F_2 < 1$. 
$3\sqrt{D} = N_3 + F_3$  with $N_3$ an integer and $0 \le F_3 < 1$. 
   ... 
$Y\sqrt{D} = N_Y + F_Y$  with $N_Y$ an integer and $0 \le F_Y < 1$. 


²The Pigeonhole Principle, so called while residing in town, often Bunburies* in the country 
under the name of the Box Principle or the Schubfachschluß. The Box Principle asserts that if there 
are more objects than boxes, then some box contains at least two objects. Many consider Boxes 
to be dull when compared to Pigeonholes, while the Germanic Schubfachschluß sounds thoroughly 
respectable and, indeed, I believe is so.** 

* The art and artifice of Bunburying is fully explained by Algernon Moncrieff in Act I of Oscar Wilde’s The Importance of Being Earnest. 

** See Algernon’s Aunt Augusta (ibid.) for more on the merits of the German language. 


Our pigeons are the $Y + 1$ numbers $F_0, F_1, \ldots, F_Y$. All the pigeons are 
between 0 and 1, so they are all sitting in the interval $0 \le t < 1$. We form $Y$ 
pigeonholes by dividing up this interval into $Y$ pieces of equal length. In other 
words, we take as pigeonholes the intervals 

Pigeonhole 1:  $0/Y \le t < 1/Y$ 
Pigeonhole 2:  $1/Y \le t < 2/Y$ 
Pigeonhole 3:  $2/Y \le t < 3/Y$ 
   ... 
Pigeonhole $Y$:  $(Y-1)/Y \le t < Y/Y$. 

Each pigeon is roosting in one pigeonhole, and there are more pigeons than holes, 
so the Pigeonhole Principle assures us that some hole contains at least two pigeons. 
Figure 33.1 illustrates the pigeons and pigeonholes for $D = 13$ and $Y = 5$, where 
we see that Pigeon 0 and Pigeon 5 are both nesting in Pigeonhole 1. 

We now know that there are two pigeons, say pigeons $F_m$ and $F_n$, that are in 
the same pigeonhole. We label these two pigeons so that $m < n$. Notice that 
the pigeonholes are quite narrow, only measuring $1/Y$ from side to side, so the 
distance between $F_m$ and $F_n$ is less than $1/Y$. In mathematical terms, 
$$|F_m - F_n| < 1/Y.$$
Next we use the fact that 
$$m\sqrt{D} = N_m + F_m \quad\text{and}\quad n\sqrt{D} = N_n + F_n$$
to rewrite this inequality as 
$$\bigl|(m\sqrt{D} - N_m) - (n\sqrt{D} - N_n)\bigr| < 1/Y.$$

[Figure 33.1: Pigeons and Pigeonholes for $D = 13$ and $Y = 5$. Pigeons 0 through 5 sit at 
0.00000, 0.60555, 0.21110, 0.81665, 0.42221, and 0.02776. The five pigeonholes are 
$0 \le t < 1/5$, $1/5 \le t < 2/5$, $2/5 \le t < 3/5$, $3/5 \le t < 4/5$, and $4/5 \le t < 1$.] 


Rearranging the terms on the left-hand side gives 
$$\bigl|(N_n - N_m) - (n - m)\sqrt{D}\bigr| < 1/Y.$$
Note that the quantities $N_n - N_m$ and $n - m$ are (positive) integers. If we call 
them $x$ and $y$, respectively, then we have accomplished our aim of making the 
quantity $|x - y\sqrt{D}|$ quite small. 

Our final task is to estimate the size of the integer $y = n - m$. The numbers $m$ 
and $n$ were chosen so that among the pigeons $F_0, F_1, \ldots, F_Y$, the pigeons $F_m$ 
and $F_n$ are in the same hole. In particular, $m$ and $n$ are between 0 and $Y$, and since 
we chose them with $n > m$, we have $0 \le m < n \le Y$. It follows that $y$ satisfies 
$$0 < y \le Y.$$
In summary, we have shown that for any integer $Y$, we can find integers $x$ and $y$ 
such that 
$$0 < y \le Y \quad\text{and}\quad |x - y\sqrt{D}| < 1/Y.$$

Furthermore, by taking $Y$ larger and larger, we automatically get new $x$’s and $y$’s. 
This is true since for any fixed $x$ and $y$ the inequality 
$$|x - y\sqrt{D}| < 1/Y$$
is false when $Y$ is large enough.³ Finally, we make the trivial observation that 
$1/Y \le 1/y$, which completes the proof of the following theorem of Dirichlet. 


Theorem 33.1 (Dirichlet’s Diophantine Approximation Theorem). (Version 1) 
Suppose that $D$ is a positive integer that is not a perfect square. Then there are 
infinitely many pairs of positive integers $(x, y)$ such that 
$$|x - y\sqrt{D}| < \frac{1}{y}.$$
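
The pigeonhole argument is completely constructive, so we can turn it into a short program. The Python sketch below follows the proof step by step: it drops the pigeons $F_0, F_1, \ldots, F_Y$ into their holes one at a time and stops as soon as two of them share a hole. The function name `dirichlet_pair` is ours. To avoid rounding errors, it never computes $\sqrt{D}$ as a decimal; instead it uses the fact that the whole part of $i\sqrt{D}$ is the integer square root of $i^2 D$.

```python
from math import isqrt

def dirichlet_pair(D, Y):
    """Return integers (x, y) with 0 < y <= Y and |x - y*sqrt(D)| < 1/Y,
    found by the pigeonhole argument. D must not be a perfect square."""
    if Y < 1 or isqrt(D) ** 2 == D:
        raise ValueError("need Y >= 1 and D not a perfect square")
    seen = {}   # hole index -> the multiple i whose pigeon F_i sits there
    for i in range(Y + 1):
        N_i = isqrt(i * i * D)                      # whole part of i*sqrt(D)
        hole = isqrt(Y * Y * i * i * D) - Y * N_i   # floor(Y * F_i), exactly
        if hole in seen:
            m = seen[hole]                          # pigeons m < i share a hole
            return N_i - isqrt(m * m * D), i - m    # x = N_n - N_m, y = n - m
        seen[hole] = i
    raise AssertionError("impossible: Y + 1 pigeons cannot fit in Y holes")

print(dirichlet_pair(13, 5))   # Pigeon 0 and Pigeon 5, as in Figure 33.1
```

Holes are numbered from 0 in the program, so the program’s hole 0 is Pigeonhole 1 in the text.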


We can use our table with $D = 13$ to illustrate Dirichlet’s Diophantine Approximation 
Theorem. There are seven pairs of numbers $(x, y)$ in the table satisfying 
the inequality 
$$|x - y\sqrt{D}| < 1/y.$$
They are 
$$(4, 1),\ (7, 2),\ (11, 3),\ (18, 5),\ (36, 10),\ (119, 33),\ (137, 38).$$
This looks like a lot, but such pairs are actually rather rare.⁴ If we were to extend 
the table up to $y = 1000$, we would find four more pairs: 
$$(256, 71),\ (393, 109),\ (649, 180),\ (1298, 360);$$
and even if we go up to $y = 5000$, we only find an additional four pairs: 
$$(4287, 1189),\ (4936, 1369),\ (9223, 2558),\ (14159, 3927).$$
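
Lists like these are tedious to compile by hand but easy to compile by machine. The Python sketch below (with a function name, `dirichlet_pairs`, of our own choosing) takes $x$ to be the integer closest to $y\sqrt{D}$, just as in our table, and then tests Dirichlet’s inequality. To stay in exact integer arithmetic it uses the observation that, for positive $x$ and $y$, the inequality $|x - y\sqrt{D}| < 1/y$ is the same as $(xy - 1)^2 < Dy^4 < (xy + 1)^2$.

```python
from math import isqrt

def dirichlet_pairs(D, y_max):
    """List the pairs (x, y) with 1 <= y <= y_max, x the integer closest
    to y*sqrt(D), and |x - y*sqrt(D)| < 1/y. Assumes D >= 2 is not a square."""
    pairs = []
    for y in range(1, y_max + 1):
        x = isqrt(D * y * y)                   # floor of y*sqrt(D)
        if 4 * D * y * y > (2 * x + 1) ** 2:   # y*sqrt(D) is closer to x + 1
            x += 1
        # Exact form of |x*y - y^2*sqrt(D)| < 1.
        if (x * y - 1) ** 2 < D * y ** 4 < (x * y + 1) ** 2:
            pairs.append((x, y))
    return pairs

print(dirichlet_pairs(13, 40))
```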


There is an entire subject, called the theory of Diophantine Approximation, 
which deals with the approximation of irrational quantities by rational numbers. 


³We are implicitly using the fact that $D$ is not a perfect square, since otherwise the quantity 
$|x - y\sqrt{D}|$ could equal 0. 

⁴Of course, in some sense such pairs are not rare, since there are infinitely many of them, and 
at first it seems nonsensical to call an infinitely available resource “rare.” However, such pairs are 
extremely rare among the set of all pairs of whole numbers. This is similar to our observation in 
Chapter 13 that “most” numbers are composite numbers, despite the fact that there are also infinitely 
many prime numbers. 

In Dirichlet’s Diophantine Approximation Theorem, the irrational number $\sqrt{D}$ 
is being approximated by the rational number $x/y$, since dividing both sides of 
Dirichlet’s inequality by $y$ gives 
$$\left|\frac{x}{y} - \sqrt{D}\right| < \frac{1}{y^2}.$$
This shows clearly that if $y$ is large, then $x/y$ is extremely close to $\sqrt{D}$. 

If you look back at our proof of Dirichlet’s Diophantine Approximation Theorem, 
you will see that we never really used the fact that $\sqrt{D}$ is the square root 
of $D$. All we really needed to know was that $\sqrt{D}$ is not itself a rational number. 
So what we really proved is the following much more general result. 


Theorem 33.2 (Dirichlet’s Diophantine Approximation Theorem). (Version 2) 
Suppose that $\alpha > 0$ is an irrational number. That is, $\alpha$ is a real number that is not 
a fraction $a/b$. Then there are infinitely many pairs of positive integers $(x, y)$ such 
that 
$$|x - y\alpha| < \frac{1}{y}.$$
For example, we could take $\alpha$ to be 
$$\pi = 3.141592653589793238462643383\ldots.$$
The following table lists all the $(x, y)$’s with $y < 500$ such that 
$$|x - y\pi| < 1/y,$$
together with the values of $|x - y\pi| \cdot y$ and $x/y$. More precisely, since we’re mainly 
interested in the ratio $x/y$, the table only lists pairs with $\gcd(x, y) = 1$. 


   x     y    $|x - y\pi| \cdot y$    $x/y$ 
   3     1    0.141593    3.0000000000 
  19     6    0.902664    3.1666666667 
  22     7    0.061960    3.1428571429 
 333   106    0.935056    3.1415094340 
 355   113    0.003406    3.1415929204 
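
A table of this kind takes only a few lines of Python to produce. In the sketch below, the function name `pi_approximations` is our own, and we rely on the floating-point value `math.pi`, which is far more accurate than we need for $y < 500$.

```python
from math import gcd, pi

def pi_approximations(y_bound):
    """List the pairs (x, y) with 1 <= y < y_bound, gcd(x, y) = 1, where x
    is the integer closest to y*pi and |x - y*pi| < 1/y."""
    rows = []
    for y in range(1, y_bound):
        x = round(y * pi)                       # integer closest to y*pi
        if gcd(x, y) == 1 and abs(x - y * pi) < 1 / y:
            rows.append((x, y))
    return rows

for x, y in pi_approximations(500):
    print(x, y, round(abs(x - y * pi) * y, 6), round(x / y, 10))
```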


Notice that the fractions 22/7 and 355/113 are especially close to 7. They have 
been widely used in the past as approximations for 7. We would have to extend 
our search considerably to find a better approximation, since it turns out that the 
next rational number close to 7 is 

103993 


33102 — 3.141592653011903.... 
We have been using the brute force approach for finding rational approximations
to irrational numbers. For each $y$, we chose the integer $x$ closest to $y\alpha$ and
then checked to see how close $x/y$ comes to $\alpha$. There is a more systematic method
for finding the best $x/y$’s based on the theory of continued fractions. We will study
continued fractions in Chapters 47 and 48, and you can read further about them
in Davenport’s The Higher Arithmetic or any standard text on elementary number
theory. To illustrate the power of continued fractions, we mention that they can be
used to find the rational numbers

$$\frac{5419351}{1725033} = 3.141592653589815383\ldots$$
and
$$\frac{21053343141}{6701487259} = 3.1415926535897932384623817\ldots,$$

which approximate $\pi$ to 13 and 21 decimal places, respectively. Clearly, we would
not want to look for such examples using the brute force approach! Continued 
fractions also provide an efficient method for solving Pell’s equation, even when 
the solution is extremely large. In the exercises you will see how the continued 
fraction method is used to find close rational approximations to a certain number 
called the Golden Ratio. 
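
The brute force approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `brute_force_approximations` is hypothetical, and `math.pi` (a double-precision value) stands in for $\pi$, which is accurate enough for $y < 500$ but not for the large fractions quoted above.

```python
from math import gcd, pi

def brute_force_approximations(alpha, y_max):
    """Return coprime pairs (x, y) with 1 <= y < y_max and |x - y*alpha| < 1/y,
    where x is the integer closest to y*alpha (the brute force choice)."""
    pairs = []
    for y in range(1, y_max):
        x = round(y * alpha)  # integer closest to y*alpha
        if gcd(x, y) == 1 and abs(x - y * alpha) < 1 / y:
            pairs.append((x, y))
    return pairs

# Print x, y, |x - y*pi|*y, and x/y for every pair found with y < 500.
for x, y in brute_force_approximations(pi, 500):
    print(x, y, round(abs(x - y * pi) * y, 6), round(x / y, 10))
```

The loop does one multiplication and one comparison per denominator, which is why it becomes impractical long before denominators such as 1725033 are reached.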


Exercises 


33.1. Prove Version 2 of Dirichlet’s Diophantine Approximation Theorem. 


33.2. The number
$$\gamma = \frac{1+\sqrt{5}}{2} = 1.61803398874989\ldots$$
is called the Golden Ratio, a term often erroneously ascribed to the ancient Greeks.
(a) For each $y < 20$, find the integer $x$ making $|x - y\gamma|$ as small as possible. Which
rational number $x/y$ with $y < 20$ most closely approximates $\gamma$?
(b) If you have access to a computer, find all pairs $(x, y)$ satisfying

$$1 \le y \le 1000, \quad \gcd(x,y) = 1, \quad\text{and}\quad |x - y\gamma| < 1/y.$$

Compare the values of $x/y$ and $\gamma$.
(c) Find out why $\gamma$ is called the Golden Ratio, and write a paragraph or two explaining
the mathematical significance of $\gamma$ and how it appears in art and architecture.


33.3. Consider the following rules for producing a list of rational numbers. 


• The first number is $r_1 = 1$.
• The second number is $r_2 = 1 + 1/r_1 = 1 + 1/1 = 2$.
• The third number is $r_3 = 1 + 1/r_2 = 1 + 1/2 = 3/2$.
• The fourth number is $r_4 = 1 + 1/r_3 = 1 + 2/3 = 5/3$.

In general, the $n$th number in the list is given by $r_n = 1 + 1/r_{n-1}$.
(a) Compute the values of $r_1, r_2, \ldots, r_{10}$. (You should get $r_{10} = 89/55$.)
(b) Let $\gamma = \frac{1}{2}(1 + \sqrt{5})$ be the Golden Ratio. Compute the differences

$$|r_1 - \gamma|,\ |r_2 - \gamma|,\ \ldots,\ |r_{10} - \gamma|$$

as decimals. Do you notice anything?

(c) If you have a computer or programmable calculator, compute $r_{20}$, $r_{30}$, and $r_{40}$ and
compare them with $\gamma$.

(d) Suppose that the numbers in the list $r_1, r_2, r_3, \ldots$ get closer and closer to some
number $r$. (In calculus notation, $r = \lim_{n\to\infty} r_n$.) Use the fact that $r_n = 1 + 1/r_{n-1}$ to
explain why $r$ should satisfy the relation $r = 1 + 1/r$. Use this to show that $r = \gamma$,
thereby explaining your observations in (b) and (c).

(e) Look again at the numerators and denominators of the fractions $r_1, r_2, r_3, \ldots$. Do
you recognize these numbers? If you do, prove that they have the value that you
claim.


33.4. Dirichlet’s Diophantine Approximation Theorem tells us that there are infinitely
many pairs of positive integers $(x,y)$ with $|x - y\sqrt{2}| < 1/y$. This exercise asks you to
see if we can do better.

(a) For each of the following $y$’s, find an $x$ such that $|x - y\sqrt{2}| < 1/y$:

$$y = 12, 17, 29, 41, 70, 99, 169, 239, 408, 577, 985, 1393, 2378, 3363.$$

(This list gives all the $y$’s between 10 and 5000 for which this is possible.) Is the
value of $|x - y\sqrt{2}|$ ever much less than $1/y$? Is it ever as small as $1/y^2$? A good way
to compare the value of $|x - y\sqrt{2}|$ with $1/y$ and $1/y^2$ is to compute the quantities
$y\,|x - y\sqrt{2}|$ and $y^2\,|x - y\sqrt{2}|$. Can you make a guess as to the smallest possible
value of $y\,|x - y\sqrt{2}|$?
(b) Prove that the following two statements are true for every pair of positive integers
$(x,y)$:

(i) $|x^2 - 2y^2| \ge 1$.

(ii) If $|x - y\sqrt{2}| < 1/y$, then $x + y\sqrt{2} < 2y\sqrt{2} + 1/y$.

Now use (i) and (ii) to show that

$$|x - y\sqrt{2}| > \frac{1}{2y\sqrt{2} + 1/y} \quad\text{for all pairs of positive integers } (x, y).$$

Does this explain your computations in (a)?


Chapter 34 


Diophantine Approximation 
and Pell’s Equation 


We now return to the problem of finding solutions to Pell’s equation
$$x^2 - Dy^2 = 1.$$

As we observed in the last chapter, we should look for solutions among those pairs
$(x, y)$ making $|x - y\sqrt{D}|$ small, since any solution to Pell’s equation satisfies

$$|x - y\sqrt{D}| = \frac{1}{x + y\sqrt{D}} < \frac{1}{y}.$$

The idea we use is to take two pairs for which $x^2 - Dy^2$ has the same value and
“divide them.”

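An example helps illustrate what we mean. We take $D = 13$. Looking at
the table in Chapter 33, we see that the pairs $(x_1, y_1) = (11, 3)$ and $(x_2, y_2) =
(119, 33)$ are both solutions to the equation $x^2 - 13y^2 = 4$. We “divide” these two
solutions as follows:

$$\frac{119 - 33\sqrt{13}}{11 - 3\sqrt{13}}
  = \left(\frac{119 - 33\sqrt{13}}{11 - 3\sqrt{13}}\right)\left(\frac{11 + 3\sqrt{13}}{11 + 3\sqrt{13}}\right)
  = \frac{22 - 6\sqrt{13}}{4}
  = \frac{11 - 3\sqrt{13}}{2}
  = \frac{11}{2} - \frac{3}{2}\sqrt{13}.$$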
Voila! The pair $(11/2, 3/2)$ is a solution to Pell’s equation $x^2 - 13y^2 = 1$.
Unfortunately, as you may already have noticed, it is not a solution in integers. The
difficulty is the appearance of that pesky 2 in the denominator. More precisely, notice
that there was a 4 in the denominator coming from the fact that $11^2 - 3^2 \cdot 13 = 4$,
and we were only able to cancel 2 out of the denominator.

Maybe if we look for more solutions to $x^2 - 13y^2 = 4$, we’ll find one that
allows us to cancel the entire 4 in the denominator. Searching for additional
solutions, we eventually find $(14159, 3927)$ and, using this solution as our $(x_2, y_2)$, we
calculate

$$\frac{14159 - 3927\sqrt{13}}{11 - 3\sqrt{13}} = \frac{2596 - 720\sqrt{13}}{4} = 649 - 180\sqrt{13}.$$

Eureka! The Pell equation $x^2 - 13y^2 = 1$ has the solution in integers $(x,y) =
(649, 180)$.

Why did the pairs $(11, 3)$ and $(14159, 3927)$ successfully lead to a solution in
integers? It turns out that these pairs got rid of the 4 in the denominator because

$$11 \equiv 14159 \pmod{4} \quad\text{and}\quad 3 \equiv 3927 \pmod{4}.$$
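
The “division” of two solutions can be carried out with exact integer arithmetic. The following is a minimal sketch; the helper name `divide_solutions` and its return convention are assumptions made for illustration, not notation from the text. It rationalizes the denominator and reports the numerator together with the common denominator $M = x_1^2 - Dy_1^2$.

```python
def divide_solutions(x2, y2, x1, y1, D):
    """Compute (x2 - y2*sqrt(D)) / (x1 - y1*sqrt(D)) exactly.

    Returns (num_x, num_y, M), meaning (num_x - num_y*sqrt(D)) / M,
    where M = x1^2 - D*y1^2 is the rationalized denominator."""
    M = x1 * x1 - D * y1 * y1
    # Expand (x2 - y2*sqrt(D)) * (x1 + y1*sqrt(D)).
    num_x = x2 * x1 - D * y2 * y1
    num_y = y2 * x1 - x2 * y1
    return num_x, num_y, M

print(divide_solutions(119, 33, 11, 3, 13))      # numerator 22 - 6*sqrt(13) over 4
print(divide_solutions(14159, 3927, 11, 3, 13))  # numerator 2596 - 720*sqrt(13) over 4
```

In the second call both numerator coefficients are divisible by 4, which is exactly the cancellation that produces the integer solution $(649, 180)$.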


Armed with this crucial observation, we are finally ready to verify Pell’s Equation 
Theorem as stated in Chapter 32. For your convenience, we restate it here. 


Theorem 34.1 (Pell’s Equation Theorem). Let $D$ be a positive integer that is not
a perfect square. Then Pell’s equation

$$x^2 - Dy^2 = 1$$

always has solutions in positive integers. If $(x_1, y_1)$ is the solution with smallest $x_1$,
then every solution $(x_k, y_k)$ can be obtained by taking powers

$$x_k + y_k\sqrt{D} = (x_1 + y_1\sqrt{D})^k \quad\text{for } k = 1, 2, 3, \ldots.$$


Proof. Our first goal is to show that Pell’s equation has at least one solution.
Version 1 of Dirichlet’s Diophantine Approximation Theorem (Theorem 33.1) tells
us that there are infinitely many pairs of positive integers $(x,y)$ that satisfy the
inequality

$$|x - y\sqrt{D}| < \frac{1}{y}.$$

Suppose that $(x, y)$ is such a pair. We want to estimate the size of

$$|x^2 - Dy^2| = |x - y\sqrt{D}| \cdot |x + y\sqrt{D}|.$$
The first factor on the right is less than $1/y$. What can we say about the second
factor?
Using the fact that $|x - y\sqrt{D}| < 1/y$, we see that $x$ is bounded by

$$x < y\sqrt{D} + 1/y,$$
and so

$$x + y\sqrt{D} < (y\sqrt{D} + 1/y) + y\sqrt{D} = 2y\sqrt{D} + 1/y < 3y\sqrt{D}.$$
Multiplying both sides of this last inequality by $|x - y\sqrt{D}|$ gives
$$|x^2 - Dy^2| < |x - y\sqrt{D}| \cdot 3y\sqrt{D} < (1/y) \cdot (3y\sqrt{D}) = 3\sqrt{D}.$$

To recapitulate, we have shown that every solution in positive integers $(x, y)$ to
the inequality

$$|x - y\sqrt{D}| < 1/y$$

also satisfies the estimate
$$|x^2 - Dy^2| < 3\sqrt{D}.$$
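
As a quick numerical sanity check of this estimate, the sketch below collects pairs with $|x - y\sqrt{D}| < 1/y$ and confirms that $|x^2 - Dy^2| < 3\sqrt{D}$ for each. The function name `dirichlet_pairs`, the choice $D = 13$, and the cutoff $y \le 200$ are assumptions made for illustration; floating-point square roots are adequate at this scale.

```python
from math import sqrt

def dirichlet_pairs(D, y_max):
    """Pairs (x, y), 1 <= y <= y_max, with x the positive integer nearest
    to y*sqrt(D) and |x - y*sqrt(D)| < 1/y."""
    root = sqrt(D)
    pairs = []
    for y in range(1, y_max + 1):
        x = round(y * root)
        if x > 0 and abs(x - y * root) < 1 / y:
            pairs.append((x, y))
    return pairs

D = 13
for x, y in dirichlet_pairs(D, 200):
    value = x * x - D * y * y
    # The estimate proved in the text: |x^2 - D*y^2| < 3*sqrt(D).
    print(x, y, value, abs(value) < 3 * sqrt(D))
```

Only finitely many values of $x^2 - Dy^2$ can appear in the third column, which is the observation the pigeonhole argument exploits.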


We now use a variant of the Pigeonhole Principle introduced in Chapter 33. Our
pigeons are positive integer solutions $(x, y)$ to the inequality $|x - y\sqrt{D}| < 1/y$.
Version 1 of Dirichlet’s Diophantine Approximation Theorem (Theorem 33.1) tells
us that there are infinitely many pigeons.¹ For pigeonholes we take the integers

$$-T,\ -T+1,\ -T+2,\ \ldots,\ -3,\ -2,\ -1,\ 0,\ 1,\ 2,\ 3,\ \ldots,\ T-2,\ T-1,\ T,$$

where $T$ is the largest integer less than $3\sqrt{D}$. We know that if $(x,y)$ is a
pigeon, then the quantity $x^2 - Dy^2$ is between $-T$ and $T$, so we can assign the
pigeon $(x, y)$ to the pigeonhole numbered $x^2 - Dy^2$.

We’ve now taken infinitely many pigeons and stuffed them into a finite
collection of pigeonholes!² Clearly, there must be some pigeonhole that contains
infinitely many pigeons. Say pigeonhole $M$ contains infinitely many pigeons. (To
simplify the exposition, we assume that $M$ is positive. The argument for negative
$M$ is very similar and is left for you to do.) In mathematical terms, this means that
the “Pell-like” equation

$$x^2 - Dy^2 = M$$


¹Don’t worry, you won’t be assigned the job of feeding the pigeons, nor will you have to clean
out the pigeonholes.

²This is a task akin to, but messier than, that of getting infinitely many angels to dance on the
head of a pin. Which brings up a question you may care to ponder: “To what extent is the Pigeonhole
Principle an Obvious Truth, and to what extent is it an Act of Faith?”
has infinitely many solutions in positive integers $(x,y)$. We write the list of
solutions as
$$(X_1, Y_1),\ (X_2, Y_2),\ (X_3, Y_3),\ (X_4, Y_4),\ \ldots.$$

Keep firmly in mind that this list continues indefinitely.
Following the path suggested by the example at the beginning of this chapter,
we look for two solutions $(X_j, Y_j)$ and $(X_k, Y_k)$ that also satisfy

$$X_j \equiv X_k \pmod{M} \quad\text{and}\quad Y_j \equiv Y_k \pmod{M}.$$


We’ll find them by once again using the Pigeonhole Principle. This time our
pigeons are the solutions $(X_1, Y_1), (X_2, Y_2), \ldots$, so we have infinitely many pigeons.
The pigeonholes are the pairs

$$(A, B) \quad\text{with } 0 \le A < M \text{ and } 0 \le B < M,$$

so there are $M^2$ pigeonholes. We assign each pigeon $(X_i, Y_i)$ to a pigeonhole by
reducing the numbers $X_i$ and $Y_i$ modulo $M$. In other words, the pigeon $(X_i, Y_i)$ is
assigned to the pigeonhole $(A, B)$ by choosing $A$ and $B$ to satisfy

$$X_i \equiv A \pmod{M}, \quad Y_i \equiv B \pmod{M}, \quad 0 \le A, B < M.$$


We have again managed to stuff infinitely many pigeons into a finite number of
pigeonholes, so again there must be some pigeonhole containing infinitely many
pigeons. In particular, we can find two different pigeons $(X_j, Y_j)$ and $(X_k, Y_k)$
nesting in the same hole. Mathematically, we have produced two pairs of positive
integers $(X_j, Y_j)$ and $(X_k, Y_k)$ with the following properties:

$$X_j \equiv X_k \pmod{M}, \qquad X_j^2 - DY_j^2 = M,$$
$$Y_j \equiv Y_k \pmod{M}, \qquad X_k^2 - DY_k^2 = M.$$

As described earlier in this chapter, we now expect to get a solution $(x, y)$ to Pell’s
equation $x^2 - Dy^2 = 1$ by setting

$$x + y\sqrt{D} = \frac{X_j - Y_j\sqrt{D}}{X_k - Y_k\sqrt{D}}
  = \frac{(X_jX_k - Y_jY_kD) + (X_jY_k - X_kY_j)\sqrt{D}}{X_k^2 - DY_k^2}.$$

In other words, we claim that the formulas

$$x = \frac{X_jX_k - Y_jY_kD}{M} \quad\text{and}\quad y = \frac{X_jY_k - X_kY_j}{M}$$

give a solution to $x^2 - Dy^2 = 1$ in integers.
First we check that $(x, y)$ satisfies Pell’s equation.

$$x^2 - Dy^2 = \left(\frac{X_jX_k - Y_jY_kD}{M}\right)^2 - D\left(\frac{X_jY_k - X_kY_j}{M}\right)^2$$
$$= \frac{X_j^2X_k^2 + D^2Y_j^2Y_k^2 - DX_j^2Y_k^2 - DX_k^2Y_j^2}{M^2}$$
$$= \frac{(X_j^2 - DY_j^2)(X_k^2 - DY_k^2)}{M^2}$$
$$= \frac{M^2}{M^2} = 1.$$

Second, we must verify that $x$ and $y$ are integers. Using the congruences
$X_j \equiv X_k \pmod{M}$ and $Y_j \equiv Y_k \pmod{M}$,
we find that the “numerators” of $x$ and $y$ satisfy

$$X_jX_k - Y_jY_kD \equiv X_j^2 - Y_j^2D = M \equiv 0 \pmod{M},$$
$$X_jY_k - X_kY_j \equiv X_jY_j - X_jY_j = 0 \pmod{M}.$$

Thus the numerators are divisible by $M$, so the $M$’s in the denominators can be
canceled. This shows that $x$ and $y$ are indeed integers, and replacing them by their
negatives if necessary, we have found a solution to Pell’s equation $x^2 - Dy^2 = 1$
in integers $x, y \ge 0$.

Clearly, $x \ge 1$. It remains to show that $y \ne 0$. But if $y = 0$, then $X_jY_k =
X_kY_j$, so we find that

$$Y_k^2M = Y_k^2(X_j^2 - DY_j^2) = (X_jY_k)^2 - D(Y_jY_k)^2$$
$$= (X_kY_j)^2 - D(Y_jY_k)^2 = Y_j^2(X_k^2 - DY_k^2) = Y_j^2M.$$

However, we chose $Y_j$ and $Y_k$ to be positive and unequal, so this cannot happen.
Therefore, $y \ne 0$, and we have found a solution $(x, y)$ to Pell’s equation in positive
integers. This completes the proof of the first half of Pell’s Equation Theorem.

For the second half, we let $(x_1, y_1)$ be the solution in positive integers with
smallest $x_1$, and we need to show that every solution is obtained by taking powers
of $x_1 + y_1\sqrt{D}$. We could reprise the proof that we gave in Chapter 31 when
$D = 2$, but instead we present a different and interesting proof that is useful in
more general situations.

Suppose that $(u,v)$ is any solution to $x^2 - Dy^2 = 1$ in positive integers. We
consider the two real numbers

$$z = x_1 + y_1\sqrt{D} \quad\text{and}\quad r = u + v\sqrt{D}.$$
The number $z$ satisfies $z > 1$, so the number $r$ lies between two powers of $z$, say

$$z^k \le r < z^{k+1}.$$
[To be precise, take $k$ to be the greatest integer less than or equal to $\log(r)/\log(z)$.]
Dividing by $z^k$ gives

$$1 \le z^{-k} \cdot r < z.$$
We observe that $z^k = x_k + y_k\sqrt{D}$, since that is how we defined $x_k$ and $y_k$, and
hence $z^{-k} = x_k - y_k\sqrt{D}$, since we know that

$$(x_k + y_k\sqrt{D})(x_k - y_k\sqrt{D}) = x_k^2 - Dy_k^2 = 1.$$
Thus

$$z^{-k} \cdot r = (x_k - y_k\sqrt{D})(u + v\sqrt{D})
  = \underbrace{(x_ku - y_kvD)}_{\text{call this } s} + \underbrace{(x_kv - y_ku)}_{\text{call this } t}\sqrt{D}.$$


We know the following three facts about $s$ and $t$:

(1) $s^2 - Dt^2 = 1$.
(2) $s + t\sqrt{D} \ge 1$.
(3) $s + t\sqrt{D} < z$.

We claim that $s \ge 0$ and $t \ge 0$. To verify this claim, we eliminate the other
possibilities. Fact (2) shows immediately that $s$ and $t$ cannot both be negative.
Suppose that $s \ge 0$ and $t < 0$. Then Fact (2) tells us that $s - t\sqrt{D} > s + t\sqrt{D} \ge 1$,
so using Fact (1) gives

$$1 = s^2 - Dt^2 = (s - t\sqrt{D})(s + t\sqrt{D}) > 1.$$

This is impossible, so we cannot have $s \ge 0$ and $t < 0$. Similarly, if $s < 0$ and
$t \ge 0$, then $-s + t\sqrt{D} > s + t\sqrt{D} \ge 1$, so

$$-1 = -s^2 + Dt^2 = (-s + t\sqrt{D})(s + t\sqrt{D}) > 1,$$

which is also impossible. We have eliminated every possibility except $s \ge 0$ and
$t \ge 0$, which completes the proof of the claim.

We now know that $(s,t)$ is a solution to $x^2 - Dy^2 = 1$ in nonnegative integers.
If $s$ and $t$ were both positive, then the assumption that $(x_1, y_1)$ is the smallest such
solution would imply that $s \ge x_1$. Furthermore,

$$t^2 = \frac{s^2 - 1}{D} \ge \frac{x_1^2 - 1}{D} = y_1^2,$$

so we also find that $t \ge y_1$, and hence

$$s + t\sqrt{D} \ge x_1 + y_1\sqrt{D} = z.$$
This contradicts the inequality $s + t\sqrt{D} < z$ in Fact (3). Thus, although $s$ and $t$
are both nonnegative, they are not both positive. So one of them is zero, and from
$s^2 - Dt^2 = 1$, we must have $t = 0$ and $s = 1$.

To recapitulate, we have shown that $z^{-k} \cdot r$ is equal to 1, which is the same as
saying that $r = z^k$. In other words, we have shown that if $(u,v)$ is any solution to
$x^2 - Dy^2 = 1$ in positive integers then there is some exponent $k \ge 1$ such that

$$r = u + v\sqrt{D} \quad\text{is equal to}\quad z^k = (x_1 + y_1\sqrt{D})^k = x_k + y_k\sqrt{D}.$$

This shows that $u + v\sqrt{D}$ is a power of $x_1 + y_1\sqrt{D}$, which completes the proof of
Pell’s Equation Theorem. □
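
The second half of the theorem says that one solution generates all the others. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and the smallest solution is found by a naive search over $y$ (not by the pigeonhole argument of the proof, which is not constructive), so it is only practical when that solution is small.

```python
from math import isqrt

def smallest_pell_solution(D, y_limit=10**6):
    """Naive search for the positive solution of x^2 - D*y^2 = 1 with smallest x."""
    for y in range(1, y_limit):
        x_squared = 1 + D * y * y
        x = isqrt(x_squared)
        if x * x == x_squared:
            return x, y
    return None  # nothing found below y_limit

def pell_solutions(D, count):
    """First `count` solutions, from powers of x1 + y1*sqrt(D)."""
    x1, y1 = smallest_pell_solution(D)
    solutions = []
    x, y = x1, y1
    for _ in range(count):
        solutions.append((x, y))
        # Multiply (x + y*sqrt(D)) by (x1 + y1*sqrt(D)).
        x, y = x * x1 + D * y * y1, x * y1 + y * x1
    return solutions

print(pell_solutions(2, 3))
print(pell_solutions(13, 2))
```

Because $x$ grows with $y$ along solutions, the first $y$ that works also gives the smallest $x$.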


Exercises 


34.1. In this chapter we have shown that Pell’s equation $x^2 - Dy^2 = 1$ always has a
solution in positive integers. This exercise explores what happens if the 1 on the right-hand
side is replaced by some other number.

(a) For each $2 \le D \le 15$ that is not a perfect square, determine whether or not the
equation $x^2 - Dy^2 = -1$ has a solution in positive integers. Can you determine a
pattern that lets you predict for which $D$’s it has a solution?

(b) If $(x_0, y_0)$ is a solution to $x^2 - Dy^2 = -1$ in positive integers, show that
$(x_0^2 + Dy_0^2, 2x_0y_0)$ is a solution to Pell’s equation $x^2 - Dy^2 = 1$.

(c) Find a solution to $x^2 - 41y^2 = -1$ by plugging in $y = 1, 2, 3, \ldots$ until you find a
value for which $41y^2 - 1$ is a perfect square. (You won’t need to go very far.) Use
your answer and (b) to find a solution to Pell’s equation $x^2 - 41y^2 = 1$ in positive
integers.

(d) If $(x_0, y_0)$ is a solution to the equation $x^2 - Dy^2 = M$, and if $(x_1, y_1)$ is a solution
to Pell’s equation $x^2 - Dy^2 = 1$, show that $(x_0x_1 + Dy_0y_1, x_0y_1 + y_0x_1)$ is also a
solution to the equation $x^2 - Dy^2 = M$. Use this to find five different solutions in
positive integers to the equation $x^2 - 2y^2 = 7$.


34.2. For each of the following equations, either find a solution (x, y) in positive integers 
or explain why no such solution can exist. 


(a) $x^2 - 11y^2 = 7$    (b) $x^2 - 11y^2 = 433$    (c) $x^2 - 11y^2 = 3$


Chapter 35 


Number Theory and Imaginary 
Numbers 


Most everyone these days is familiar with the “number”
$$i = \sqrt{-1}.$$

The use of “$i$” to denote the square root of negative 1 dates back to the days when
people viewed such numbers with great suspicion and, indeed, felt that they were
so far from being real numbers that they deserved to be called imaginary. In these
more enlightened times we recognize that all¹ numbers are, to some extent,
abstractions that can be used to solve certain sorts of problems. For example, negative
numbers (which were not used by European mathematicians even in the fourteenth 
century, although they were in use in India as early as AD 600) are not needed for 
counting cattle, but they are useful in keeping track of who owes how many cattle 
to whom. Fractions arise naturally when people start dealing with objects that can 
be subdivided, such as bushels of wheat or fields of corn. Irrational numbers—that 
is, numbers that are not fractions—appear in even the simplest sorts of measure- 
ments, as the Pythagoreans discovered when they found that the diagonal of certain 
geometric figures may be incommensurable with their sides. This discovery upset 
the conventional mathematical wisdom of the time, and the penalty was death for 
those who revealed the secret. The introduction of imaginary numbers caused sim- 
ilar consternation in nineteenth-century Europe, although thankfully the sanctions 
imposed on those using imaginary numbers were less severe than in earlier times. 
Imaginary numbers and more generally complex numbers,

$$z = x + iy,$$

¹Or at least, almost all. As Leopold Kronecker (1823-1891) so eloquently put it, “God made the
integers, all else is the work of man.”
were introduced into mathematics for a definite purpose, the solution of equations.
Thus negative numbers are needed to solve the equation $x + 3 = 0$, while fractions
are needed for $3x - 7 = 0$. For the equation $x^2 - 5 = 0$, we need the irrational
quantity $\sqrt{5}$, but even if we introduce more general irrational numbers, we still
won’t be able to solve the very simple equation $x^2 + 1 = 0$. Since this equation
doesn’t have any solutions in “real numbers,” there’s nothing to stop us from creating
a new sort of number to be a solution and giving that new number the name $i$.
This is no different from observing that since the equation $x^2 - 5 = 0$ has no
solution in fractions, we are free to create a solution and call it $\sqrt{5}$. In fact, we’re even
doing the same thing when we observe that $3x - 7 = 0$ has no solutions in whole
numbers, so we create a solution and call it $\frac{7}{3}$.

Complex numbers were thus invented² to solve equations such as $x^2 + 1 = 0$,
but why stop there? Now that we know about complex numbers, we can try to
solve more complicated equations, even equations with complex coefficients such
as

$$(3 + 2i)x^3 - (\sqrt{3} - 5i)x^2 - (\sqrt[7]{5} + \sqrt[3]{14}\,i)x + 17 - 8i = 0.$$

If this equation had no solutions, then we would be forced to invent even more
numbers. Amazingly, it turns out that this equation does have solutions in complex
numbers. The solutions (accurate to five decimal places) are

$$1.27609 + 0.72035i, \qquad 0.03296 - 2.11802i, \qquad -1.67858 - 0.02264i.$$


In fact, there are enough complex numbers to solve every equation of this sort, a 
statement that has a long history and an impressive name. 


Theorem 35.1 (The Fundamental Theorem of Algebra). If $a_0, a_1, a_2, \ldots, a_d$ are
complex numbers with $a_0 \ne 0$ and $d \ge 1$, then the equation

$$a_0x^d + a_1x^{d-1} + a_2x^{d-2} + \cdots + a_{d-1}x + a_d = 0$$
has a solution in complex numbers.


This theorem was formulated (and used) by many mathematicians during the 
eighteenth century, but the first satisfactory proofs weren’t discovered until the 
early part of the nineteenth century. Many proofs are now known, some using 
mostly algebra, some using analysis (calculus), and some using geometric ideas. 
Unfortunately, none of the proofs is easy, so we won’t give one here. 


²Some would say that complex numbers already existed and were merely discovered, while others
believe just as strongly that mathematical entities such as the complex numbers are abstractions that 
were created by human imagination. This question of whether mathematics is discovered or created 
is a fascinating philosophical conundrum for which (as with most good philosophical questions) there 
is unlikely ever to be a definitive answer. 
Instead, we investigate the number theory of the complex numbers. You
undoubtedly recall the simple rules for adding and subtracting complex numbers, and
multiplication is just as easy,

$$(a + bi)(c + di) = ac + adi + bci + bdi^2 = (ac - bd) + (ad + bc)i.$$
For division we use the old trick of rationalizing the denominator,

$$\frac{a+bi}{c+di} = \frac{a+bi}{c+di} \cdot \frac{c-di}{c-di} = \frac{(ac+bd) + (-ad+bc)i}{c^2+d^2}.$$


We do number theory with a certain subset of the complex numbers called the 
Gaussian integers. These are the complex numbers that look like 


a+ bi with a and b both integers. 


The Gaussian integers have many properties in common with the ordinary integers.
For example, if $\alpha$ and $\beta$ are Gaussian integers, then so are their sum $\alpha + \beta$, their
difference $\alpha - \beta$, and their product $\alpha\beta$. However, the quotient of two Gaussian
integers need not be a Gaussian integer (just as the quotient of two ordinary integers
need not be an integer). For example,

$$\frac{3+2i}{1-6i} = \frac{-9+20i}{37}$$

is not a Gaussian integer, while

$$\frac{16-11i}{3+2i} = \frac{26-65i}{13} = 2 - 5i$$

is a Gaussian integer.
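
The two quotients above can be checked with exact integer arithmetic by applying the rationalizing-the-denominator formula directly. In this minimal sketch a Gaussian integer $a + bi$ is represented as the pair `(a, b)`; that representation and the function names are assumptions made for illustration.

```python
def g_mul(z, w):
    """Product of Gaussian integers z = a+bi and w = c+di, as pairs."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def exact_quotient(z, w):
    """Return z / w as a pair if the quotient is a Gaussian integer, else None."""
    a, b = z
    c, d = w
    n = c * c + d * d                       # denominator c^2 + d^2
    re, im = a * c + b * d, -a * d + b * c  # numerator (ac+bd) + (-ad+bc)i
    if n != 0 and re % n == 0 and im % n == 0:
        return (re // n, im // n)
    return None

print(exact_quotient((16, -11), (3, 2)))  # quotient is a Gaussian integer
print(exact_quotient((3, 2), (1, -6)))    # None: 37 does not divide -9 and 20
```

Multiplying a quotient back with `g_mul` recovers the dividend, which gives a simple way to double-check any division.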

This suggests that we define a notion of divisibility for Gaussian integers just
as we did for ordinary integers. So we say that the Gaussian integer $a + bi$ divides
the Gaussian integer $c + di$ if we can find a Gaussian integer $e + fi$ such that

$$c + di = (a + bi)(e + fi).$$

Of course, this is the same as saying that the quotient $\frac{c+di}{a+bi}$ is a Gaussian
integer. For example, we saw that $3 + 2i$ divides $16 - 11i$, but $1 - 6i$ does not divide
$3 + 2i$.

Now that we know how to talk about divisibility, we can talk about factorization.
For example, the number $1238 - 1484i$ factors as

$$1238 - 1484i = (2 + 3i)^3 \cdot (-1 + 4i) \cdot (3 + i)^2.$$
And even ordinary integers such as $600 = 2^3 \cdot 3 \cdot 5^2$, which we think we already
know how to factor, can be factored further using Gaussian integers:

$$600 = i \cdot (1 + i)^6 \cdot 3 \cdot (2 + i)^2 \cdot (2 - i)^2.$$
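
Factorizations like these are easy to get wrong by a unit, so it is worth multiplying them out. The sketch below does so with exact integer pairs `(a, b)` standing for $a + bi$; the helper names are assumptions made for illustration. It uses the unit $i$ in the factorization of 600, since $i \cdot (1+i)^6 = i \cdot (-8i) = 8$.

```python
from functools import reduce

def g_mul(z, w):
    """Product of Gaussian integers given as pairs (a, b) = a + bi."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def g_pow(z, n):
    """z raised to the nonnegative integer power n."""
    result = (1, 0)
    for _ in range(n):
        result = g_mul(result, z)
    return result

def g_prod(factors):
    """Product of a list of Gaussian integers."""
    return reduce(g_mul, factors, (1, 0))

print(g_prod([g_pow((2, 3), 3), (-1, 4), g_pow((3, 1), 2)]))
print(g_prod([(0, 1), g_pow((1, 1), 6), (3, 0), g_pow((2, 1), 2), g_pow((2, -1), 2)]))
```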


For the ordinary integers, the primes are the basic building blocks because they 
cannot be factored any further. Of course, technically this isn’t quite true, since the 
prime 7 can be “factored” as 


$$7 = 1 \cdot 7, \quad\text{or as}\quad 7 = (-1) \cdot (-1) \cdot 7, \quad\text{or even as}$$
$$7 = (-1) \cdot (-1) \cdot (-1) \cdot (-1) \cdot 1 \cdot 1 \cdot 1 \cdot 7.$$

However, we recognize that these aren’t really different factorizations, because we
can always put in more 1’s and pairs of $-1$’s. What is it about 1 and $-1$ that makes
them special? The answer is that they are the only two integers that have integer
(multiplicative) inverses:

$$1 \cdot 1 = 1 \quad\text{and}\quad (-1) \cdot (-1) = 1.$$

(In fact, they are their own inverses, but this turns out to be less important.) Notice
that if $a$ is any integer other than 1 and $-1$ then $a$ does not have an integer
multiplicative inverse, since the equation $ab = 1$ does not have a solution $b$ that is an
integer. We say that 1 and $-1$ are the only units in the ordinary integers.

The Gaussian integers are blessed with more units than are the ordinary integers.
For example, $i$ itself is a unit, since

$$i \cdot (-i) = 1.$$

This equation also shows that $-i$ is a unit, so we see that the Gaussian integers
have at least four units: $1$, $-1$, $i$, and $-i$. Are there any others?

To answer this question, we suppose that $a + bi$ is a unit in the Gaussian integers.
This means that it has a multiplicative inverse, so there is another Gaussian
integer $c + di$ such that

$$(a + bi)(c + di) = 1.$$

Multiplying everything out, we find that
$$ac - bd = 1 \quad\text{and}\quad ad + bc = 0,$$

so we are looking for solutions $(a, b, c, d)$ to these equations in ordinary integers.
Later we will see a fancier (and more geometric) way to solve this problem, but for
now, let’s just use a little algebra and a little common sense.
We need to consider several cases. First, if $a = 0$ then $-bd = 1$, so $b = \pm 1$
and $a + bi = \pm i$. Second, if $b = 0$, then $ac = 1$, so $a = \pm 1$ and $a + bi = \pm 1$.
These first two cases lead to the four units that we already know.

For our third and final case, suppose that $a \ne 0$ and $b \ne 0$. Then we can solve
the first equation for $c$ and substitute it into the second equation:

$$c = \frac{1 + bd}{a} \implies ad + b\left(\frac{1 + bd}{a}\right) = 0 \implies \frac{a^2d + b + b^2d}{a} = 0.$$

Thus any solution with $a \ne 0$ must satisfy
$$(a^2 + b^2)d = -b.$$

This means that $a^2 + b^2$ divides $b$, which is absurd, since $a^2 + b^2$ is larger than $|b|$
(remember neither $a$ nor $b$ is 0). This means that Case 3 yields no new units, so we
have completed the proof of our first theorem about the Gaussian integers.


Theorem 35.2 (Gaussian Unit Theorem). The only units in the Gaussian integers
are $1$, $-1$, $i$, and $-i$. That is, these are the only Gaussian integers that have
Gaussian integer multiplicative inverses.


One thing that makes the Gaussian integers an interesting subset of the com- 
plex numbers is that the sum, difference, and product of any two Gaussian integers 
are again Gaussian integers. Notice that the ordinary integers also have this prop- 
erty. A subset of the complex numbers that has this property (and also contains the 
numbers 0 and 1) is called a ring, so the ordinary integers and the Gaussian inte- 
gers are examples of rings. Many other interesting rings lurk within the complex 
numbers, some of which you will have an opportunity to study in the Exercises for 
this chapter. 

Returning to our discussion of factorization in the Gaussian integers, we might
say that a Gaussian integer $\alpha$ is prime if it is only divisible by $\pm 1$ and itself, but
this is clearly the wrong thing to do. For example, we can always write

$$\alpha = i \cdot (-i) \cdot \alpha,$$

so any $\alpha$ is divisible by $i$ and by $-i$ and by $i\alpha$ and by $-i\alpha$. This leads us to the
correct definition. A Gaussian integer $\alpha$ is called a Gaussian prime if the only
Gaussian integers that divide $\alpha$ are the eight numbers

$$1,\ -1,\ i,\ -i,\ \alpha,\ -\alpha,\ i\alpha,\ \text{and } -i\alpha.$$

In other words, the only numbers dividing $\alpha$ are units and $\alpha$ times a unit.
(a) Complex Numbers in the Plane (b) The Gaussian Integers

Figure 35.1: The Geometry of the Complex Numbers


Now that we know what Gaussian primes are, can we identify them? For
example, which of the following do you think are Gaussian primes?

$$2, \quad 3, \quad 5, \quad 1 + i, \quad 3 + i, \quad 2 + 3i.$$

We could answer this question using the algebraic ideas we employed earlier when
we determined all the Gaussian units, but our task is easier if we use a soupçon of
geometry.

We introduce geometry into our study of complex numbers by identifying each 
complex number x + yi with the point (x,y) in the plane. This idea is illus- 
trated in Figure 35.1(a). The Gaussian integers are then identified with the integer 
points (x, y), that is, the points with x and y both integers. Figure 35.1(b) shows 
that the Gaussian integers form a square-shaped lattice of points in the plane. 

Having identified the complex number $x + yi$ with the point $(x, y)$, we can talk
about the distance between two complex numbers. In particular, the distance from
$x + yi$ to 0 is $\sqrt{x^2 + y^2}$. It is a little more convenient to work with the square of
this distance, so we define the norm of $x + yi$ to be

$$N(x + yi) = x^2 + y^2.$$


Intuitively, the norm of a complex number $\alpha$ is a sort of measure of the size or
magnitude of $\alpha$. In this sense, the norm measures a geometric quantity. On the
other hand, the norm also has a very important algebraic property: The norm of a
product equals the product of the norms. It is the interplay between these geometric
and algebraic properties that makes the norm such a useful tool for studying the
Gaussian integers. We now verify the multiplication property.


Theorem 35.3 (Norm Multiplication Property). Let $\alpha$ and $\beta$ be any complex
numbers. Then

$$N(\alpha\beta) = N(\alpha)\,N(\beta).$$
Proof. If we write $\alpha = a + bi$ and $\beta = c + di$, then
$$\alpha\beta = (ac - bd) + (ad + bc)i,$$
so we need to check that
$$(ac - bd)^2 + (ad + bc)^2 = (a^2 + b^2)(c^2 + d^2).$$

This is easily verified by multiplying out both sides, a task that we leave for you
(or look back at Chapter 24). □


Before returning to the problem of factorization, it is instructive to see how the
norm can be used to find the units. Thus, suppose that $\alpha = a + bi$ is a unit. This
means that there is a $\beta = c + di$ such that $\alpha\beta = 1$. Taking norms of both sides and
using the Norm Multiplication Property yields

$$N(\alpha)\,N(\beta) = N(\alpha\beta) = N(1) = 1,$$
so

$$(a^2 + b^2)(c^2 + d^2) = 1.$$

But $a, b, c, d$ are all integers, so we must have $a^2 + b^2 = 1$. The only solutions to
this equation are

$$(a, b) = (1, 0),\ (-1, 0),\ (0, 1),\ (0, -1),$$

which gives us a new proof that the only Gaussian units are $1$, $-1$, $i$, and $-i$. We
also obtain a useful characterization:


A Gaussian integer $\alpha$ is a unit if and only if $N(\alpha) = 1$.
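
Both the Norm Multiplication Property and the unit characterization can be spot-checked by machine. The sketch below is only an illustration: it tests a small grid of Gaussian integers (the grid size is an arbitrary assumption), which of course proves nothing in general but does catch algebra slips.

```python
from itertools import product

def norm(z):
    """Norm of the Gaussian integer z = (a, b), that is, a^2 + b^2."""
    a, b = z
    return a * a + b * b

def g_mul(z, w):
    """Product of Gaussian integers given as pairs."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

grid = list(product(range(-3, 4), repeat=2))  # all a + bi with |a|, |b| <= 3

# Norm Multiplication Property on every pair from the grid.
multiplicative = all(norm(g_mul(z, w)) == norm(z) * norm(w) for z in grid for w in grid)
print(multiplicative)

# Units are exactly the elements of norm 1.
units = [z for z in grid if norm(z) == 1]
print(units)
```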


Now let’s try to factor some numbers. We start with the number 2 and try to
factor it as
$$(a + bi)(c + di) = 2.$$

Taking the norm of both sides yields
$$(a^2 + b^2)(c^2 + d^2) = 4.$$

[Note that $N(2) = N(2 + 0i) = 2^2 + 0^2 = 4$.] We don’t want either of $a + bi$ or
$c + di$ to be a unit, so neither $a^2 + b^2$ nor $c^2 + d^2$ is allowed to equal 1. Since their
product is supposed to equal 4 and they’re both positive integers, we must have

$$a^2 + b^2 = 2 \quad\text{and}\quad c^2 + d^2 = 2.$$
These equations certainly have solutions. For example, if we take $(a,b) = (1, 1)$
and divide 2 by $a + bi = 1 + i$, we get

$$c + di = \frac{2}{1 + i} = \frac{2(1 - i)}{2} = 1 - i.$$

Hence $2 = (1 + i)(1 - i)$, so 2 is not a Gaussian prime!
If we try to factor 3 in a similar fashion, we end up with the equations

$$a^2 + b^2 = 3 \quad\text{and}\quad c^2 + d^2 = 3.$$

These equations clearly have no solutions, so 3 is a Gaussian prime. On the other
hand, if we start with 5, we end up with the factorization $5 = (2 + i)(2 - i)$.

We can use the same procedure to factor Gaussian integers that are not ordinary
integers. The general method for factoring a Gaussian integer $\alpha$ is to set

$$(a + bi)(c + di) = \alpha$$
and take the norm of both sides to obtain
$$(a^2 + b^2)(c^2 + d^2) = N(\alpha).$$

This is an equation in integers, and we want a nontrivial solution, by which we
mean a solution where neither $a^2 + b^2$ nor $c^2 + d^2$ equals 1. So the first thing we
need to do is factor the integer $N(\alpha)$ into a product $AB$ with $A \ne 1$ and $B \ne 1$.
Then we need to solve

$$a^2 + b^2 = A \quad\text{and}\quad c^2 + d^2 = B.$$

Thus, factorization of Gaussian integers leads us back to the sums of two squares
problem that we studied in Chapters 24 and 25.

To see how this works in practice, we factor $\alpha = 3 + i$. The norm of $\alpha$ is
$N(\alpha) = 10$, which factors as $2 \cdot 5$, so we solve $a^2 + b^2 = 2$ and $c^2 + d^2 = 5$. There
are a number of solutions. For example, if we take $(a, b) = (1, 1)$, then we obtain
the factorization

$$3 + i = (1 + i)(2 - i).$$

Do you understand why we get several solutions? It has to do with the fact that the
factorization of $3 + i$ can always be changed by units. Thus, if we take $(a,b) =
(-1, 1)$, we get $3 + i = (-1 + i)(-1 - 2i)$, which is really the same factorization,
since $-1 + i = i(1 + i)$ and $-1 - 2i = -i(2 - i)$.
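
The factoring method just described can be sketched as a small search: split $N(\alpha)$ as $AB$, list the pairs with $a^2 + b^2 = A$, and keep those that actually divide $\alpha$. The function names and the pair representation `(a, b)` for $a + bi$ are assumptions made for illustration, and the search is a naive one suited only to small norms.

```python
from math import isqrt

def norm(z):
    a, b = z
    return a * a + b * b

def exact_quotient(z, w):
    """z / w as a pair if it is a Gaussian integer, else None."""
    a, b = z
    c, d = w
    n = norm(w)
    re, im = a * c + b * d, -a * d + b * c
    if n != 0 and re % n == 0 and im % n == 0:
        return (re // n, im // n)
    return None

def nontrivial_factorizations(alpha):
    """All (beta, gamma) with beta*gamma == alpha and neither factor a unit."""
    n = norm(alpha)
    found = []
    for A in range(2, n):          # A and B = n/A must both differ from 1
        if n % A != 0:
            continue
        r = isqrt(A)
        for a in range(-r, r + 1):
            for b in range(-r, r + 1):
                if a * a + b * b == A:
                    gamma = exact_quotient(alpha, (a, b))
                    if gamma is not None:
                        found.append(((a, b), gamma))
    return found

print(nontrivial_factorizations((3, 1)))  # several factorizations, differing by units
print(nontrivial_factorizations((3, 0)))  # empty list: no nontrivial factorization
```

An empty result means the input has no nontrivial factorization, so (for a nonzero non-unit) it is a Gaussian prime.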

What happens when we try to factor $\alpha = 1 + i$? The norm of $\alpha$ is $N(\alpha) = 2$,
and 2 cannot be factored as $2 = AB$ with ordinary integers $A, B > 1$. This
means that $\alpha$ has no nontrivial factorizations in the Gaussian integers, so it is prime.
Similarly, we cannot factor $2 + 3i$ in the Gaussian integers, since $N(2 + 3i) = 13$
is a prime in the ordinary integers. [But note that 13 is not a Gaussian prime, since
$13 = (2 + 3i)(2 - 3i)$.] So $2 + 3i$ is a Gaussian prime. More generally, if $N(\alpha)$
is an ordinary prime, then the same reasoning shows that $\alpha$ must be a Gaussian
prime. It turns out that these are half the Gaussian primes, and the other half are
numbers like 3 that are both ordinary primes and Gaussian primes. The following
theorem gives a complete description of all Gaussian primes. Don’t be fooled
by the shortness of its proof, which merely reflects all our hard work in earlier
chapters. It is a deep and beautiful result.


Theorem 35.4 (Gaussian Prime Theorem). The Gaussian primes can be described 
as follows: 
(i) $1 + i$ is a Gaussian prime. 
(ii) Let $p$ be an ordinary prime with $p \equiv 3 \pmod 4$. Then $p$ is a Gaussian prime. 
(iii) Let $p$ be an ordinary prime with $p \equiv 1 \pmod 4$ and write $p$ as a sum of two 
squares $p = u^2 + v^2$ (see Chapter 24). Then $u + vi$ is a Gaussian prime. 
Every Gaussian prime is equal to a unit ($\pm 1$ or $\pm i$) multiplied by a Gaussian prime 
of the form (i), (ii), or (iii).³ 
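
The three categories translate directly into a primality test. The sketch below is one possible reading of the theorem, assuming $a + bi$ is passed as two ordinary integers; the helper names `is_ordinary_prime` and `is_gaussian_prime` are hypothetical, and trial division is used only to keep the example short.

```python
def is_ordinary_prime(n):
    """Trial-division primality test for ordinary integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_gaussian_prime(a, b):
    """Decide whether a + bi is a Gaussian prime using categories (i)-(iii)."""
    if a != 0 and b != 0:
        # Off the axes: a unit times a category (i) or (iii) prime,
        # which happens exactly when the norm is an ordinary prime.
        return is_ordinary_prime(a * a + b * b)
    # On an axis: a unit times the ordinary integer |a| + |b|,
    # which is a category (ii) prime exactly when it is a prime = 3 (mod 4).
    p = abs(a) + abs(b)
    return is_ordinary_prime(p) and p % 4 == 3

print(is_gaussian_prime(1, 1), is_gaussian_prime(3, 0), is_gaussian_prime(13, 0))
```

Here `13` fails the test because it splits as $(2 + 3i)(2 - 3i)$, while `3` passes.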


Proof. As we observed previously, if $N(\alpha)$ is an ordinary prime, then $\alpha$ must be a 
Gaussian prime. The number $1 + i$ in category (i) has norm 2, so it is a Gaussian 
prime. Similarly, the numbers $u + vi$ in category (iii) have norm $u^2 + v^2 = p$, so 
they are also Gaussian primes. 
Next we check category (ii), so we let $\alpha = p$ be an ordinary prime with 
$p \equiv 3 \pmod 4$. If $\alpha$ were to have a factorization into Gaussian integers, say 
$(a + bi)(c + di) = \alpha$, then taking norms would yield 

$$(a^2 + b^2)(c^2 + d^2) = N(\alpha) = p^2.$$

To get a nontrivial factorization, we would need to solve 

$$a^2 + b^2 = p \quad\text{and}\quad c^2 + d^2 = p.$$

But we know from the Sum of Two Squares Theorem (Chapter 24) that since $p \equiv 
3 \pmod 4$, it cannot be written as a sum of two squares, so there are no solutions. 
Therefore, $p$ cannot be factored, so it is a Gaussian prime. 


³As you know, mathematicians love to give obscure names to the objects they study. In this 
instance, category (i) primes are called ramified, category (ii) primes are called inert, and category (iii) 
primes are called split. This means that if we cover a prime in category (iii) with ice cream and fruit, 
it becomes a banana split prime! 
We’ve now shown that the numbers in categories (i), (ii), and (iii) are indeed 
Gaussian primes, so it remains to show that every Gaussian prime fits into one of 
these three categories. To do this, we use the following lemma. 


Lemma 35.5 (Gaussian Divisibility Lemma). Let $\alpha = a + bi$ be a Gaussian integer. 

(a) If 2 divides $N(\alpha)$, then $1 + i$ divides $\alpha$. 

(b) Let $\pi = p$ be a category (ii) prime, and suppose that $p$ divides $N(\alpha)$ as 
ordinary integers. Then $\pi$ divides $\alpha$ as Gaussian integers. 

(c) Let $\pi = u + vi$ be a Gaussian prime in category (iii), and let $\bar\pi = u - vi$. 
(This is a natural notation, since $\bar\pi$ is indeed the complex conjugate of the complex 
number $\pi$.) Suppose that $N(\pi) = p$ divides $N(\alpha)$ as ordinary integers. Then at 
least one of $\pi$ and $\bar\pi$ divides $\alpha$ as Gaussian integers. 


Proof of Lemma. (a) We are given that 2 divides $N(\alpha) = a^2 + b^2$, so $a$ and $b$ are 
either both odd or both even. It follows that $a + b$ and $-a + b$ are both divisible 
by 2, so the quotient 

$$\frac{a + bi}{1 + i} = \frac{(a + b) + (-a + b)i}{2}$$

is a Gaussian integer. Hence $a + bi$ is divisible by $1 + i$. 
(b) We are given that $p \equiv 3 \pmod 4$ and that $p$ divides $a^2 + b^2$. This means that 
$a^2 \equiv -b^2 \pmod p$, so we can compute the Legendre symbols 

$$\left(\frac{a}{p}\right)^2 = \left(\frac{a^2}{p}\right) = \left(\frac{-b^2}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{b}{p}\right)^2.$$

Since $p \equiv 3 \pmod 4$, the Law of Quadratic Reciprocity (Chapter 21) tells us that 
$\left(\frac{-1}{p}\right) = -1$, so we get 

$$\left(\frac{a}{p}\right)^2 = -\left(\frac{b}{p}\right)^2.$$

But the value of a Legendre symbol is $\pm 1$, so we seem to have ended up with 
$1 = -1$. What went wrong? Stop and try to figure it out for yourself before you 
read on. 

The answer is that a Legendre symbol such as $\left(\frac{a}{p}\right)$ only makes sense if $a \not\equiv 
0 \pmod p$; we never assigned a value to $\left(\frac{0}{p}\right)$. So the egress from our seeming 
contradiction is that $a$ and $b$ must both be divisible by $p$, say $a = pa'$ and $b = pb'$. 
Hence $\alpha = a + bi = p(a' + b'i)$ is divisible by $p = \pi$, which is what we were 
trying to prove. 

(c) We are given that $p$ divides $N(\alpha)$, so we can write 

$$N(\alpha) = a^2 + b^2 = pK \quad\text{for some integer } K \ge 1.$$

We need to show that at least one of the two numbers 

$$\frac{\alpha}{\pi} = \frac{(au + bv) + (-av + bu)i}{p} \quad\text{and}\quad \frac{\alpha}{\bar\pi} = \frac{(au - bv) + (av + bu)i}{p}$$

is a Gaussian integer. 
Our first observation is that 

$$\begin{aligned}
(au + bv)(au - bv) &= a^2u^2 - b^2v^2 \\
&= a^2u^2 - b^2(p - u^2) \\
&= (a^2 + b^2)u^2 - pb^2 \\
&= pKu^2 - pb^2,
\end{aligned}$$

so at least one of the two integers $au + bv$ and $au - bv$ is divisible by $p$. A similar 
calculation shows that 

$$(-av + bu)(av + bu) = pKu^2 - pa^2,$$

so at least one of $-av + bu$ and $av + bu$ is divisible by $p$. There are thus four cases 
to consider: 

Case 1. $au + bv$ and $-av + bu$ are divisible by $p$. 

Case 2. $au + bv$ and $av + bu$ are divisible by $p$. 

Case 3. $au - bv$ and $-av + bu$ are divisible by $p$. 

Case 4. $au - bv$ and $av + bu$ are divisible by $p$. 

Case 1 is easy, since it immediately implies that the quotient $\alpha/\pi$ is a Gaussian 
integer, hence $\pi$ divides $\alpha$. Similarly, for Case 4, the quotient $\alpha/\bar\pi$ is a Gaussian 
integer, so $\bar\pi$ divides $\alpha$. This takes care of Cases 1 and 4. 


Next consider Case 2, which is a little more complicated. We are given that $p$ 
divides both $au + bv$ and $av + bu$, from which we deduce that $p$ divides 

$$(au + bv)b - (av + bu)a = (b^2 - a^2)v.$$

(The idea here is that we “eliminated” $u$ from the equation.) Since $p$ clearly doesn’t 
divide $v$ (remember that $p = u^2 + v^2$), we see that $p$ divides $b^2 - a^2$. However, we 
also know that $p$ divides $a^2 + b^2$, so we find that $p$ divides both 

$$2a^2 = (a^2 + b^2) - (b^2 - a^2) \quad\text{and}\quad 2b^2 = (a^2 + b^2) + (b^2 - a^2).$$

Since $p \ne 2$, we finally deduce that $p$ divides both $a$ and $b$, say $a = pa'$ and $b = pb'$. 
Then 

$$\alpha = a + bi = p(a' + b'i) = (u^2 + v^2)(a' + b'i) = \pi\bar\pi(a' + b'i),$$

so for Case 2 we find that $\alpha$ is actually divisible by both $\pi$ and $\bar\pi$. 
Finally, we observe that a similar argument for Case 3 leads to the same 
conclusion as in Case 2; we leave the details for you to complete. □ 
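
Part (c) of the lemma is easy to spot-check numerically. The following sketch assumes Gaussian integers are stored as pairs `(a, b)` and uses the quotient formula from the proof as its divisibility test; the names `divides` and `lemma_c_holds` are illustrative only, and a finite search is of course no substitute for the proof.

```python
def divides(pi, alpha):
    """True if pi = (u, v) divides alpha = (a, b) in the Gaussian integers."""
    (u, v), (a, b) = pi, alpha
    p = u * u + v * v
    # alpha / pi = ((au + bv) + (-av + bu) i) / p, as in the proof of part (c)
    return (a * u + b * v) % p == 0 and (-a * v + b * u) % p == 0

def lemma_c_holds(u, v, bound):
    """Check part (c) for pi = u + vi and every a + bi with |a|, |b| <= bound."""
    p = u * u + v * v
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if (a * a + b * b) % p == 0:
                if not (divides((u, v), (a, b)) or divides((u, -v), (a, b))):
                    return False
    return True

print(lemma_c_holds(2, 3, 30))  # pi = 2 + 3i, p = 13
```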


Resumption of Proof. After that (not so) brief interlude, we’re ready to resume 
proving the Gaussian Prime Theorem. We suppose that $\alpha = a + bi$ is a Gaussian 
prime, and our aim is to show that $\alpha$ fits into one of the three categories of 
Gaussian primes. We know that $N(\alpha) \ne 1$, since $\alpha$ is not a unit, so there is (at 
least) one prime $p$ that divides $N(\alpha)$. 

Suppose first that $p = 2$. Then Part (a) of the Lemma tells us that $1 + i$ 
divides $\alpha$. But $\alpha$ is supposed to be prime, so this means that $\alpha$ must equal $1 + i$ 
multiplied by a unit, so $\alpha$ fits into category (i). 

Next suppose that $p \equiv 3 \pmod 4$. Then Part (b) of the Lemma tells us that $p$ 
divides $\alpha$, so again the primality of $\alpha$ implies that $\alpha$ equals a unit times $p$. Hence $\alpha$ 
is a category (ii) prime. 

Finally, suppose that $p \equiv 1 \pmod 4$. The Sum of Two Squares Theorem in 
Chapter 24 tells us that $p$ can be written as a sum of two squares, say $p = u^2 + v^2$, 
and then Part (c) of the Lemma says that $\alpha$ is divisible by either $u + iv$ or by $u - iv$. 
Hence $\alpha$ is a unit times one of $u + iv$ or $u - iv$. In particular, $a^2 + b^2 = u^2 + v^2 = p$, 
so $\alpha$ is a category (iii) prime. This completes our proof that every Gaussian prime 
fits into one of the three categories. □ 


Exercises 


35.1. Write a short essay (one or two pages) on the following topics: 
(a) The introduction of complex numbers in nineteenth-century Europe 
(b) The discovery of irrational numbers in ancient Greece 
(c) The introduction of zero and negative numbers into Indian mathematics, Arabic math- 
ematics, and European mathematics 
(d) The discovery of transcendental numbers in nineteenth-century Europe 


35.2. (a) Choose one of the following two statements and write a one-page essay defend- 
ing it. Be sure to give at least three specific reasons why your statement is true and 
the opposing statement is incorrect. 


Statement 1. Mathematics already exists and is merely discovered by people (in the 
same sense that the dwarf planet Pluto existed before it was discovered in 1930). 


Statement 2. Mathematics is an abstract creation invented by people to describe the 
world (and possibly even an abstract creation with no relation to the real world). 


(b) Now switch your perspective and repeat part (a) using the other statement. 
35.3. Write each of the following quantities as a complex number. 


3 — 2% 1+i\* 
gs 27) (14+ b 
@ B-2)- +4) wT @ (TE) 
35.4. (a) Solve the equation $x^2 = 95 - 168i$ using complex numbers. [Hint. First set 
$(u + vi)^2 = 95 - 168i$, then square the left-hand side and solve for $u$ and $v$.] 
(b) Solve the equation $x^2 = 1 + 2i$ using complex numbers. 


35.5. For each part, check whether the Gaussian integer $\alpha$ divides the Gaussian integer $\beta$ 
and, if it does, find the quotient. 

(a) $\alpha = 3 + 5i$ and $\beta = 11 - 8i$ 

(b) $\alpha = 2 - 3i$ and $\beta = 4 + 7i$ 

(c) $\alpha = 3 - 39i$ and $\beta = 3 - 5i$ 

(d) $\alpha = 3 - 5i$ and $\beta = 3 - 39i$ 


35.6. (a) Show that the statement that $a + bi$ divides $c + di$ is equivalent to the statement 
that the ordinary integer $a^2 + b^2$ divides both of the integers $ac + bd$ and $-ad + bc$. 
(b) Suppose that $a + bi$ divides $c + di$. Show that $a^2 + b^2$ divides $c^2 + d^2$. 


35.7. Verify that each of the following subsets $R_1, R_2, R_3, R_4$ of the complex numbers is 
a ring. In other words, show that if $\alpha$ and $\beta$ are in the set, then $\alpha + \beta$, $\alpha - \beta$, and $\alpha\beta$ are 
also in the set. 
(a) $R_1 = \{a + bi\sqrt{2} : a \text{ and } b \text{ are ordinary integers}\}$. 
(b) Let $\rho$ be the complex number $\rho = -\frac{1}{2} + \frac{1}{2}i\sqrt{3}$. 
$R_2 = \{a + b\rho : a \text{ and } b \text{ are ordinary integers}\}$. 
[Hint. $\rho$ satisfies the equation $\rho^2 + \rho + 1 = 0$.] 
(c) Let $p$ be a fixed prime number. 
$R_3 = \{a/d : a \text{ and } d \text{ are ordinary integers such that } p \nmid d\}$. 
(d) $R_4 = \{a + b\sqrt{3} : a \text{ and } b \text{ are ordinary integers}\}$. 


35.8. An element $\alpha$ of a ring $R$ is called a unit if there is an element $\beta \in R$ satisfying 
$\alpha\beta = 1$. In other words, $\alpha \in R$ is a unit if it has a multiplicative inverse in $R$. Describe all 
the units in each of the following rings. 
(a) $R_1 = \{a + bi\sqrt{2} : a \text{ and } b \text{ are ordinary integers}\}$. 
[Hint. Use the Norm Multiplication Property for numbers $a + bi\sqrt{2}$.] 
(b) Let $\rho$ be the complex number $\rho = -\frac{1}{2} + \frac{1}{2}i\sqrt{3}$. 
$R_2 = \{a + b\rho : a \text{ and } b \text{ are ordinary integers}\}$. 
(c) Let $p$ be a fixed prime number. 
$R_3 = \{a/d : a \text{ and } d \text{ are ordinary integers such that } p \nmid d\}$. 


35.9. Let $R$ be the ring $\{a + b\sqrt{3} : a \text{ and } b \text{ are ordinary integers}\}$. For any element 
$\alpha = a + b\sqrt{3}$ in $R$, define the “norm” of $\alpha$ to be $N(\alpha) = a^2 - 3b^2$. (Note that $R$ is a subset 
of the real numbers, and this “norm” is not the square of the distance from $\alpha$ to 0.) 
(a) Show that $N(\alpha\beta) = N(\alpha)\,N(\beta)$ for every $\alpha$ and $\beta$ in $R$. 
(b) If $\alpha$ is a unit in $R$, show that $N(\alpha)$ equals 1. [Hint. First show that $N(\alpha)$ must 
equal $\pm 1$; then figure out why it can’t equal $-1$.] 
(c) If $N(\alpha) = 1$, show that $\alpha$ is a unit in $R$. 
(d) Find eight different units in $R$. 
(e) Describe all the units in $R$. [Hint. See Chapter 34.] 


35.10. Complete the proof of the Gaussian Divisibility Lemma Part (c) by proving that in 
Case 3 the Gaussian integer $\alpha$ is divisible by both $\pi$ and $\bar\pi$. 


35.11. Factor each of the following Gaussian integers into a product of Gaussian primes. 
(You may find the Gaussian Divisibility Lemma helpful in deciding which Gaussian primes 
to try as factors.) 


(a) $91 + 63i$ (b) $975$ (c) $53 + 62i$ 


Chapter 36 


The Gaussian Integers 
and Unique Factorization 


We saw in the last chapter that it can be as much fun doing number theory with 
the Gaussian integers as it is doing number theory with the ordinary integers. In 
fact, some might consider the Gaussian integers to be even more fun, since they 
contain even more prime numbers to play with. We saw long ago how the ordinary 
primes are the basic building blocks used to form all other integers, and we proved 
the fundamental result that each integer can be constructed from primes in exactly 
one way. Although the Fundamental Theorem of Arithmetic studied in Chapter 7 
seemed obvious at first, our trip to the “Even Number World” (the E-Zone) con- 
vinced us that it is far more subtle than it initially appeared. 

The question now arises as to whether every Gaussian integer can be written as 
a product of Gaussian primes in exactly one way. Of course, rearranging the order 
of the factors is not considered a different factorization, but there are other possible 
difficulties. For example, consider the two factorizations 


$$11 - 10i = (3 + 2i)(1 - 4i) \quad\text{and}\quad 11 - 10i = (2 - 3i)(4 + i).$$

They look different, but if you remember our discussion of units, you'll notice that 

$$3 + 2i = i \cdot (2 - 3i) \quad\text{and}\quad 1 - 4i = -i \cdot (4 + i).$$

So the two supposedly different factorizations of $11 - 10i$ arise from the relation 
$i \cdot (-i) = 1$. We would have had the same problem with the ordinary integers if we had 
allowed both positive and negative prime numbers, since, for example, $6 = 2 \cdot 3 = 
(-2) \cdot (-3)$ has two seemingly “different” factorizations into primes. To avoid 
this difficulty, we selected the positive primes as our basic building blocks. This 
suggests that we do something similar for the Gaussian integers, but clearly we 
can’t talk about positive complex numbers versus negative complex numbers. 

If $\alpha = a + bi$ is any nonzero Gaussian integer, then we can multiply $\alpha$ by each 
of the units $1$, $-1$, $i$, and $-i$ to obtain the numbers 

$$\alpha = a + bi, \qquad i\alpha = -b + ai, \qquad -\alpha = -a - bi, \qquad -i\alpha = b - ai.$$

If you plot these four Gaussian integers in the complex plane, you will find that 
exactly one of them is in the first quadrant. More precisely, exactly one of them 
has its $x$-coordinate $> 0$ and its $y$-coordinate $\ge 0$. We say that 

$$x + yi \text{ is normalized if } x > 0 \text{ and } y \ge 0.$$


These normalized Gaussian integers will play the same role as positive ordinary 
integers. 
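
Normalization is mechanical enough to code in a few lines. The sketch below assumes $a + bi$ is passed as two ordinary integers and returns both the unit used and the normalized associate; the name `normalize` and the tuple encoding of units are choices made for this illustration, not notation from the text.

```python
def normalize(a, b):
    """Return (unit, (x, y)) with unit * (a + bi) == x + yi, x > 0 and y >= 0."""
    if (a, b) == (0, 0):
        raise ValueError("zero has no normalized associate")
    units = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # 1, i, -1, -i
    for c, d in units:
        # (c + di)(a + bi) = (ac - bd) + (ad + bc) i
        x, y = a * c - b * d, a * d + b * c
        if x > 0 and y >= 0:
            return (c, d), (x, y)

print(normalize(2, -3))  # ((0, 1), (3, 2)), that is, i * (2 - 3i) = 3 + 2i
```

Because exactly one of the four associates satisfies $x > 0$ and $y \ge 0$, the loop always returns for nonzero input.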


Theorem 36.1 (Unique Factorization of Gaussian Integers). Every Gaussian 
integer $\alpha \ne 0$ can be factored into a unit $u$ multiplied by a product of normalized 
Gaussian primes 

$$\alpha = u\pi_1\pi_2 \cdots \pi_r$$

in exactly one way. 


As usual, a few words of explanation are required. First, if $\alpha$ is itself a unit, we 
take $r = 0$ and $u = \alpha$ and let the factorization of $\alpha$ be simply $\alpha = u$. Second, the 
Gaussian primes $\pi_1, \ldots, \pi_r$ do not have to be different; an alternative description 
is to write the factorization of $\alpha$ as 

$$\alpha = u\pi_1^{e_1}\pi_2^{e_2} \cdots \pi_r^{e_r}$$

using distinct Gaussian primes $\pi_1, \ldots, \pi_r$ and exponents $e_1, \ldots, e_r > 0$. Third, 
when we say that there is exactly one factorization, we obviously do not consider 
a rearrangement of the factors to be a new factorization. 
If you review¹ the proof of the Fundamental Theorem of Arithmetic in 
Chapter 7, you will see that the decisive property of primes, from which all else naturally 
flows, is the following simple assertion: 


If a prime divides a product 
of two numbers, then it divides 
(at least) one of the numbers. 


¹So what are you waiting for, go back and review! 

Luckily for us, the Gaussian integers also have this property, but before giving the 
proof, we need to know that when we divide one Gaussian integer by another the 
remainder is smaller than the number with which we’re dividing. 

This is so obvious for ordinary integers that you probably wouldn’t think it was 
worth mentioning. For example, if we divide 177 by 37, we get a quotient of 4 and 
a remainder of 29. In other words, 


$$177 = 4 \cdot 37 + 29,$$


and the remainder 29 is smaller than the divisor 37. 

However, matters are far less clear for the Gaussian integers. For example, if 
we divide $237 + 504i$ by $15 - 17i$, what are the quotient and remainder, and how 
can we even talk about the remainder being smaller than the divisor? The answer 
to the second question is easy; we measure the size of a Gaussian integer $a + bi$ 
by its norm $N(a + bi) = a^2 + b^2$, so we can ask that the remainder have smaller 
norm than the divisor. But is it possible to divide $237 + 504i$ by $15 - 17i$ and get 
a remainder whose norm is smaller than $N(15 - 17i) = 514$? The answer is yes, 
since 

$$237 + 504i = (-10 + 23i)(15 - 17i) + (-4 - 11i).$$

This says that $237 + 504i$ divided by $15 - 17i$ gives a quotient of $-10 + 23i$ 
and a remainder of $-4 - 11i$, and clearly $N(-4 - 11i) = 137$ is smaller than 
$N(15 - 17i) = 514$. 

We now prove that it is always possible to divide Gaussian integers and get a 
small remainder. The proof is a pleasing blend of algebra and geometry. 


Theorem 36.2 (Gaussian Integer Division with Remainder). Let $\alpha$ and $\beta$ be 
Gaussian integers with $\beta \ne 0$. Then there are Gaussian integers $\gamma$ and $\rho$ such that 

$$\alpha = \beta\gamma + \rho \quad\text{and}\quad N(\rho) < N(\beta).$$

Proof. If we divide the equation we’re trying to prove by $\beta$, it becomes 

$$\frac{\alpha}{\beta} = \gamma + \frac{\rho}{\beta} \quad\text{with}\quad N\!\left(\frac{\rho}{\beta}\right) < 1.$$

This means that we should choose $\gamma$ to be as close to $\alpha/\beta$ as possible, since we 
want the difference between $\gamma$ and $\alpha/\beta$ to be small. 

If the ratio $\alpha/\beta$ is itself a Gaussian integer, then we can take $\gamma = \alpha/\beta$ and 
$\rho = 0$; but in general $\alpha/\beta$ is not a Gaussian integer. However, it is certainly a 
complex number, so we can mark it in the complex plane as illustrated in Figure 36.1. 
We next tile the complex plane into square boxes by drawing vertical and 
horizontal lines through all the Gaussian integers. The complex number $\alpha/\beta$ lies in 
one of these squares, and we take $\gamma$ to be the closest corner of the square that 
contains $\alpha/\beta$. Note that $\gamma$ is a Gaussian integer, since the corners of the squares are 
the Gaussian integers. 


Figure 36.1: Closest Gaussian Integer $\gamma$ to the Quantity $\alpha/\beta$ 


The farthest that $\alpha/\beta$ can be from $\gamma$ occurs if $\alpha/\beta$ is exactly in the middle of a 
square, so 

$$\left(\text{distance from } \alpha/\beta \text{ to } \gamma\right) \le \frac{\sqrt{2}}{2}.$$

(The diagonal of the square has length $\sqrt{2}$, so the middle of the square is half 
of $\sqrt{2}$ from the corners.) If we square both sides and use the fact that the norm is 
the square of the length, we obtain 

$$N\!\left(\frac{\alpha}{\beta} - \gamma\right) \le \frac{1}{2}.$$

Next we multiply both sides by $N(\beta)$ and use the Norm Multiplication Property to 
obtain 

$$N(\alpha - \beta\gamma) \le \tfrac{1}{2}N(\beta).$$

Finally, we simply choose $\rho$ to be $\rho = \alpha - \beta\gamma$, and then we get the desired 
properties: 

$$\alpha = \beta\gamma + \rho \quad\text{and}\quad N(\rho) < N(\beta).$$

[In fact, we get the stronger inequality $N(\rho) \le \tfrac{1}{2}N(\beta)$.] □ 
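
The proof is constructive: round the real and imaginary parts of $\alpha/\beta$ to the nearest integers. Here is a minimal sketch of that recipe, assuming Gaussian integers are stored as pairs `(a, b)`; the names `round_div` and `gaussian_divmod` are illustrative, and integer arithmetic is used throughout to avoid floating-point rounding.

```python
def round_div(n, d):
    """Nearest integer to n/d for d > 0 (ties round up), using only integers."""
    return (2 * n + d) // (2 * d)

def gaussian_divmod(alpha, beta):
    """Return (gamma, rho) with alpha = beta*gamma + rho and N(rho) <= N(beta)/2."""
    (a, b), (c, d) = alpha, beta
    n = c * c + d * d                      # N(beta)
    if n == 0:
        raise ZeroDivisionError("division by the Gaussian integer 0")
    # alpha / beta = alpha * conj(beta) / N(beta)
    re, im = a * c + b * d, b * c - a * d
    q = (round_div(re, n), round_div(im, n))
    # rho = alpha - beta * gamma
    r = (a - (q[0] * c - q[1] * d), b - (q[0] * d + q[1] * c))
    return q, r

print(gaussian_divmod((237, 504), (15, -17)))  # ((-10, 23), (-4, -11))
```

The output reproduces the quotient $-10 + 23i$ and remainder $-4 - 11i$ from the example preceding the theorem.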


The next step is to use Gaussian Integer Division with Remainder to show that 
the “smallest” nonzero number of the form $A\alpha + B\beta$ divides both $\alpha$ and $\beta$. It is 
instructive to compare this with the analogous property of ordinary integers that 
we proved in Chapter 6. 
Theorem 36.3 (Gaussian Integer Common Divisor Property). Let $\alpha$ and $\beta$ be 
Gaussian integers, and let $S$ be the collection of Gaussian integers 

$$A\alpha + B\beta, \quad\text{where } A \text{ and } B \text{ are any Gaussian integers.}$$

Among all the Gaussian integers in $S$, choose an element 

$$g = a\alpha + b\beta$$

having the smallest nonzero norm. In other words, 

$$0 < N(g) \le N(A\alpha + B\beta) \quad\text{for any Gaussian integers } A \text{ and } B \text{ with } A\alpha + B\beta \ne 0.$$

Then $g$ divides both $\alpha$ and $\beta$. 


Proof. We use Gaussian Integer Division with Remainder to divide $\alpha$ by $g$, 

$$\alpha = g\gamma + \rho \quad\text{with}\quad 0 \le N(\rho) < N(g).$$

Our goal is to show that the remainder $\rho$ is zero. 
Substituting $g = a\alpha + b\beta$ into $\alpha = g\gamma + \rho$ and doing a little algebra yields 

$$(1 - a\gamma)\alpha - b\gamma\beta = \rho.$$

Thus $\rho$ is in the set $S$, since it has the form 

(Gaussian integer times $\alpha$) + (Gaussian integer times $\beta$). 

On the other hand, $N(\rho) < N(g)$, and we chose $g$ to have the smallest nonzero 
norm among the elements of $S$. Therefore, $N(\rho)$ must equal 0, which means that 
$\rho = 0$. This shows that $\alpha = g\gamma$, so $g$ divides $\alpha$. 

Finally, reversing the roles of $\alpha$ and $\beta$ and repeating the argument shows that $g$ 
also divides $\beta$. □ 
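
The theorem only asserts that a suitable $g$ exists. One way to actually find such a $g$, together with coefficients $a$ and $b$, is to run the Euclidean algorithm with the division step of Theorem 36.2. This is a different technique from the minimality argument in the proof, offered here only as a sketch; the function names are hypothetical and Gaussian integers are stored as pairs.

```python
def gmul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def gsub(z, w):
    return (z[0] - w[0], z[1] - w[1])

def gdivmod(alpha, beta):
    """Nearest-corner division: alpha = beta*q + r with N(r) <= N(beta)/2."""
    (a, b), (c, d) = alpha, beta
    n = c * c + d * d
    q = ((2 * (a * c + b * d) + n) // (2 * n),
         (2 * (b * c - a * d) + n) // (2 * n))
    return q, gsub(alpha, gmul(beta, q))

def gaussian_xgcd(alpha, beta):
    """Return (g, a, b) with g = a*alpha + b*beta dividing both alpha and beta."""
    r0, r1 = alpha, beta
    a0, a1 = (1, 0), (0, 0)
    b0, b1 = (0, 0), (1, 0)
    while r1 != (0, 0):
        q, r = gdivmod(r0, r1)
        r0, r1 = r1, r
        a0, a1 = a1, gsub(a0, gmul(q, a1))
        b0, b1 = b1, gsub(b0, gmul(q, b1))
    return r0, a0, b0

# 3 + i = (1 + i)(2 - i) and -1 + 5i = (1 + i)(2 + 3i) share the factor 1 + i.
print(gaussian_xgcd((3, 1), (-1, 5)))
```

The loop terminates because the norm of the remainder at least halves at every step.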


Now we’re ready to show that if a Gaussian prime divides a product of two 
Gaussian integers, then it divides at least one of the two. 


Theorem 36.4 (Gaussian Prime Divisibility Property). Let $\pi$ be a Gaussian prime, 
let $\alpha$ and $\beta$ be Gaussian integers, and suppose that $\pi$ divides the product $\alpha\beta$. Then 
either $\pi$ divides $\alpha$ or $\pi$ divides $\beta$ (or both). 

More generally, if $\pi$ divides a product $\alpha_1\alpha_2 \cdots \alpha_n$ of Gaussian integers, then 
it divides at least one of the factors $\alpha_1, \alpha_2, \ldots, \alpha_n$. 
Proof. We apply the Gaussian Integer Common Divisor Property to the two 
numbers $\alpha$ and $\pi$. This tells us that we can find Gaussian integers $a$ and $b$ such that the 
number 

$$g = a\alpha + b\pi \quad\text{divides both } \alpha \text{ and } \pi.$$

But $\pi$ is a prime, so the fact that $g$ divides $\pi$ means either that $g$ is a unit or else $g$ 
is equal to $\pi$ times a unit. We consider these two cases separately. 

First, suppose that $g = u\pi$ for some unit $u$; that is, $u$ is one of the 
numbers $1$, $-1$, $i$, or $-i$. Since we also know that $g$ divides $\alpha$, it follows that $\pi$ 
divides $\alpha$, so we are done. 

Second, suppose that $g$ itself is a unit. We multiply the equation $g = a\alpha + b\pi$ 
by $\beta$ to get 

$$g\beta = a\alpha\beta + b\pi\beta.$$

We are told that $\pi$ divides $\alpha\beta$, so this equation tells us that $\pi$ divides $g\beta$. Since $g$ is 
a unit, it follows that $\pi$ divides $\beta$, so again we are done. This completes the proof 
that if a prime $\pi$ divides a product $\alpha\beta$, then it divides at least one of the factors $\alpha$ 
or $\beta$. 

This proves the first part of the Gaussian Prime Divisibility Property. For the 
second part we can use induction on the number $n$ of factors. We have proved the 
case $n = 2$ (i.e., two factors $\alpha_1\alpha_2$), which is enough to get our induction started. 
Now suppose that we have proved the Gaussian Prime Divisibility Property for 
all products having fewer than $n$ factors, and suppose that $\pi$ divides a product 
$\alpha_1\alpha_2 \cdots \alpha_n$ having $n$ factors. If we let $\alpha = \alpha_1 \cdots \alpha_{n-1}$ and $\beta = \alpha_n$, then $\pi$ 
divides $\alpha\beta$, so we know from above that either $\pi$ divides $\alpha$ or $\pi$ divides $\beta$. If $\pi$ 
divides $\beta$, then we’re done, since $\beta = \alpha_n$. On the other hand, if $\pi$ divides $\alpha$, then $\pi$ 
divides the product $\alpha_1 \cdots \alpha_{n-1}$ consisting of $n - 1$ factors, so by the induction 
hypothesis we know that $\pi$ divides one of the factors $\alpha_1, \ldots, \alpha_{n-1}$. This completes 
the proof of the Gaussian Prime Divisibility Property. □ 


We are finally ready to prove that every nonzero Gaussian integer has a unique 
factorization into primes. 


Proof of Unique Factorization of Gaussian Integers. We begin by demonstrating 
that every Gaussian integer has some factorization into primes. We could sim- 
ply mimic the proof we gave in Chapter 7, but for the sake of variety, we instead 
give a proof by contradiction. (To refresh your memory of how these work, see 
the proof of Theorem 8.2 on page 60.) We begin the proof by assuming that the 
following statement is true, and we use this assumption to deduce a contradiction, 
which then lets us conclude that the statement is false. 


Statement: There exists at least one nonzero Gaussian 
integer that does not factor into primes. 

Among the nonzero Gaussian integers with this property, we choose one (call it $\alpha$) 
having smallest norm. We can do this, since the norms of nonzero Gaussian 
integers are positive integers, and any collection of positive integers has a smallest 
element. Notice that $\alpha$ cannot itself be prime, since otherwise $\alpha = \alpha$ is already 
a factorization of $\alpha$ into primes. Similarly, $\alpha$ cannot be a unit, since otherwise 
$\alpha = \alpha$ would again be a factorization into primes (in this case, into zero primes). 
But if $\alpha$ is neither prime nor a unit, then it must factor $\alpha = \beta\gamma$ into a product of 
two Gaussian integers, neither of which is a unit. 

Now consider the norms of $\beta$ and $\gamma$. Since $\beta$ and $\gamma$ are not units, we know that 
$N(\beta) > 1$ and $N(\gamma) > 1$. We also have the multiplication property $N(\beta)\,N(\gamma) = 
N(\alpha)$, so 

$$N(\beta) = \frac{N(\alpha)}{N(\gamma)} < N(\alpha) \quad\text{and}\quad N(\gamma) = \frac{N(\alpha)}{N(\beta)} < N(\alpha).$$

But we chose $\alpha$ to be the Gaussian integer of smallest norm that does not factor 
into primes, so both $\beta$ and $\gamma$ do factor into primes. In other words, 

$$\beta = \pi_1\pi_2 \cdots \pi_r \quad\text{and}\quad \gamma = \pi_1'\pi_2' \cdots \pi_s'$$

for certain Gaussian primes $\pi_1, \ldots, \pi_r, \pi_1', \ldots, \pi_s'$. But then 

$$\alpha = \beta\gamma = \pi_1\pi_2 \cdots \pi_r\,\pi_1'\pi_2' \cdots \pi_s'$$

is also a product of primes, which contradicts the choice of $\alpha$ as a number that 
cannot be written as a product of primes. This proves that our Statement must 
be false, since it leads to the absurdity that $\alpha$ both does and does not factor into 
primes. In other words, we have proved that the statement “there exist nonzero 
Gaussian integers that do not factor into primes” is false, so we have proved that 
every nonzero Gaussian integer does factor into primes. 

The second part of the theorem requires us to show that the factorization into 
primes can be done in only one way, subject to the caveats already described. Again 
we could mimic the proof in Chapter 7, but instead we use a proof by contradiction. 
We start with the following statement: 


Statement: There exists at least one nonzero Gaussian 
integer with two distinct factorizations into primes. 

Assuming the truth of this statement, we look at the set of all Gaussian integers 
having two distinct factorizations into primes (the statement assures us this set is 
not empty), and we take $\alpha$ to be an element of the set having the smallest possible 
norm. 
This means that $\alpha$ has two different factorizations 

$$\alpha = u\pi_1\pi_2 \cdots \pi_r = u'\pi_1'\pi_2' \cdots \pi_s',$$

where the primes are normalized as described at the beginning of this chapter. 
Clearly, $\alpha$ cannot be a unit, since otherwise we would have $\alpha = u = u'$, so the 
factorizations would not be different. This means that $r \ge 1$, so there is a prime $\pi_1$ 
in the first factorization. Then $\pi_1$ divides $\alpha$, so 

$$\pi_1 \text{ divides the product } u'\pi_1'\pi_2' \cdots \pi_s'.$$

The Gaussian Prime Divisibility Property tells us that $\pi_1$ divides at least one of 
the numbers $u', \pi_1', \ldots, \pi_s'$. It certainly doesn’t divide the unit $u'$, so it divides 
one of the factors. Rearranging the order of these other factors, we may assume 
that $\pi_1$ divides $\pi_1'$. However, the number $\pi_1'$ is a Gaussian prime, so its only 
divisors are units and itself times units. Since $\pi_1$ is not a unit, we deduce that 

$$\pi_1 = (\text{unit}) \times \pi_1'.$$

Furthermore, both $\pi_1$ and $\pi_1'$ are normalized, so the unit must equal 1 and $\pi_1 = \pi_1'$. 

Let $\beta = \alpha/\pi_1 = \alpha/\pi_1'$. Canceling $\pi_1$ from the two factorizations of $\alpha$ yields 

$$\beta = u\pi_2 \cdots \pi_r = u'\pi_2' \cdots \pi_s'.$$

This number $\beta$ has the following two properties: 

• $N(\beta) = N(\alpha)/N(\pi_1) < N(\alpha)$. 

• $\beta$ has two distinct factorizations into primes (since $\alpha$ has this property, and 
we canceled the same factor from both sides of the two factorizations of $\alpha$). 

This contradicts the choice of $\alpha$ as the smallest number with two distinct 
factorizations into primes, and hence our original statement must be false. Thus, there 
do not exist any Gaussian integers with two distinct factorizations into primes, so 
every Gaussian integer has a unique such factorization. □ 


We use the Gaussian Integer Unique Factorization Theorem to count how many 
different ways a number can be written as a sum of two squares. For example, 
how many ways can the number 45 be written as a sum of two squares? A little 
experimentation quickly yields 

$$45 = 3^2 + 6^2,$$

and this is the only way to write 45 as $a^2 + b^2$ with $a$ and $b$ positive and $a < b$. Of 
course, we could switch the two terms to get $45 = 6^2 + 3^2$, and we could also use 
negative numbers, for example, 

$$45 = (-3)^2 + 6^2 \quad\text{and}\quad 45 = (-6)^2 + (-3)^2.$$

It is convenient to count all of these as different. So we say that 45 can be written 
as a sum of two squares in eight different ways: 

$$\begin{aligned}
45 &= 3^2 + 6^2 & 45 &= 6^2 + 3^2 \\
45 &= (-3)^2 + 6^2 & 45 &= 6^2 + (-3)^2 \\
45 &= 3^2 + (-6)^2 & 45 &= (-6)^2 + 3^2 \\
45 &= (-3)^2 + (-6)^2 & 45 &= (-6)^2 + (-3)^2
\end{aligned}$$

In general, we write 

$$R(N) = \text{number of ways to write } N \text{ as a sum of two squares.}$$


This is also known as the number of representations of N as a sum of two squares, 
which explains the nomenclature. Our example says that 


R(45) = 8. 
Similarly, $R(65) = 16$, since 

$$\begin{aligned}
65 &= 1^2 + 8^2 & 65 &= 8^2 + 1^2 \\
65 &= (-1)^2 + 8^2 & 65 &= 8^2 + (-1)^2 \\
65 &= 1^2 + (-8)^2 & 65 &= (-8)^2 + 1^2 \\
65 &= (-1)^2 + (-8)^2 & 65 &= (-8)^2 + (-1)^2 \\
65 &= 4^2 + 7^2 & 65 &= 7^2 + 4^2 \\
65 &= (-4)^2 + 7^2 & 65 &= 7^2 + (-4)^2 \\
65 &= 4^2 + (-7)^2 & 65 &= (-7)^2 + 4^2 \\
65 &= (-4)^2 + (-7)^2 & 65 &= (-7)^2 + (-4)^2.
\end{aligned}$$
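
Counting by hand gets tedious quickly, so here is a small brute-force counter for $R(N)$. It is only a sketch: the name `count_representations` is made up for this example, and it counts ordered pairs of integers of either sign, exactly as in the lists above.

```python
from math import isqrt

def count_representations(n):
    """R(n): number of ordered integer pairs (a, b) with a*a + b*b == n."""
    count = 0
    r = isqrt(n)
    for a in range(-r, r + 1):
        b2 = n - a * a
        b = isqrt(b2)
        if b * b == b2:
            # b and -b are different solutions unless b == 0
            count += 1 if b == 0 else 2
    return count

print(count_representations(45), count_representations(65))  # 8 16
```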


The following beautiful theorem gives a surprisingly simple formula for the 
number of representations of an integer $N$ as a sum of two squares. 

Theorem 36.5 (Sum of Two Squares Theorem (Legendre)). For a given positive 
integer $N$, let 

$$\begin{aligned}
D_1 &= (\text{the number of positive integers } d \text{ dividing } N \text{ that satisfy } d \equiv 1 \pmod 4), \\
D_3 &= (\text{the number of positive integers } d \text{ dividing } N \text{ that satisfy } d \equiv 3 \pmod 4).
\end{aligned}$$

Then $N$ can be written as a sum of two squares in exactly 

$$R(N) = 4(D_1 - D_3) \text{ ways.}$$


Before giving the proof of Legendre’s formula, we illustrate the theorem with 
the number $N = 45$. The divisors of 45 are 

$$1, \quad 3, \quad 5, \quad 9, \quad 15, \quad 45.$$

Four of these divisors (1, 5, 9, 45) are congruent to 1 modulo 4, so $D_1 = 4$, while 
two of the divisors (3 and 15) are congruent to 3 modulo 4, so $D_3 = 2$. The 
theorem says that 

$$R(45) = 4(D_1 - D_3) = 4(4 - 2) = 8,$$

which agrees with our earlier calculation. Similarly, the number 65 has the four 
divisors 1, 5, 13, and 65, all of which are congruent to 1 modulo 4. Thus the 
theorem predicts that 

$$R(65) = 4(4 - 0) = 16,$$

again agreeing with the preceding calculation. 
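
Legendre's formula can be compared against a direct count for many values of $N$ at once. The sketch below does this for small $N$; both function names are illustrative, and the divisor loop is deliberately naive.

```python
from math import isqrt

def legendre_count(n):
    """4 * (D1 - D3), where Dk counts the divisors of n congruent to k mod 4."""
    d1 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 1)
    d3 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 3)
    return 4 * (d1 - d3)

def brute_force_count(n):
    """Number of ordered integer pairs (a, b) with a*a + b*b == n."""
    r = isqrt(n)
    return sum(1
               for a in range(-r, r + 1)
               for b in range(-r, r + 1)
               if a * a + b * b == n)

print(legendre_count(45), legendre_count(65))  # 8 16
print(all(legendre_count(n) == brute_force_count(n) for n in range(1, 301)))
```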


Proof of Legendre’s Sum of Two Squares Theorem. The proof has two steps. First 
we find a formula for $R(N)$. Next we find a formula for $D_1 - D_3$. Comparing the 
two formulas completes the proof. 

Although the proof is not very hard, it may seem complicated because of the 
notation. So we first explain how to use Gaussian integers to compute $R(N)$ for 
a particular number $N$. If you can follow the proof for this value of $N$, then you 
should have no trouble with the general proof. 

We use the number $N = 28949649300$. We begin by factoring $N$ into a 
product of ordinary primes and grouping together the primes that are congruent to 1 
modulo 4 and the ones congruent to 3 modulo 4, 

$$N = 28949649300 = 2^2 \cdot \underbrace{(5^2 \cdot 13^3)}_{\text{1 mod 4 primes}} \cdot \underbrace{(3^2 \cdot 11^4)}_{\text{3 mod 4 primes}}.$$

Next we factor $N$ into a product of Gaussian primes. Using the facts that $2 = 
-i(1 + i)^2$, the primes congruent to 1 modulo 4 factor into products of conjugate 
Gaussian primes, and the primes congruent to 3 modulo 4 are already Gaussian 
primes, we obtain the factorization 

$$N = -(1 + i)^4 \cdot \left((2 + i)^2(2 - i)^2 \cdot (2 + 3i)^3(2 - 3i)^3\right) \cdot (3^2 \cdot 11^4).$$
Now suppose that we want to write $N$ as a sum of two squares, say $N = A^2 + B^2$. 
This means that 

$$N = (A + Bi)(A - Bi),$$

so by unique factorization of Gaussian integers, $A + Bi$ is a product of some of 
the primes dividing $N$, and $A - Bi$ is the product of the remaining ones. 

However, we don’t have complete freedom in distributing the primes 
dividing $N$, because $A + Bi$ and $A - Bi$ are complex conjugates of one another. That 
is, changing $i$ to $-i$ changes one into the other. This means that if some prime 
power $(a + bi)^e$ divides $A + Bi$, then the conjugate prime power $(a - bi)^e$ must 
divide $A - Bi$. So, for example, if $(2 + i)^2$ divides $A + Bi$, then $(2 - i)^2$ divides 
$A - Bi$, so there won’t be any $2 - i$ factors left to divide $A + Bi$. 

This reasoning also applies to the Gaussian primes congruent to 3 modulo 4. 
Thus we can’t have 9 dividing $A + Bi$, since then there wouldn’t be any factors 
of 3 left over to divide $A - Bi$. These observations show that the factors $A + Bi$ 
of $N = 28949649300$ must look like 

$$A + Bi = \text{unit} \cdot (1 + i)^2 \cdot (2 + i)^n(2 - i)^{2-n} \cdot (2 + 3i)^m(2 - 3i)^{3-m} \cdot 3 \cdot 11^2,$$

where we can take any $0 \le n \le 2$ and any $0 \le m \le 3$. There are thus 3 choices 
for $n$, there are 4 choices for $m$, and there are the usual 4 choices of the unit, so 
there are $4 \cdot 3 \cdot 4 = 48$ possibilities for $A + Bi$. The unique factorization property 
of Gaussian integers tells us that writing $N$ as a sum of two squares is exactly the 
same problem as finding an $A + Bi$ dividing $N$, so we conclude that $R(N) = 48$. 
But it is important to keep in mind that this number 48 is really the product of the 
following three quantities: 

• the number of units in the Gaussian integers 
• one more than the exponent of $2 + i$ 
• one more than the exponent of $2 + 3i$ 
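
The 48 possibilities can be generated explicitly by running over the unit and the exponents $n$ and $m$. The sketch below assumes Gaussian integers are stored as pairs `(A, B)`; the helper names are illustrative. It multiplies out each choice and collects the resulting values of $A + Bi$.

```python
def gmul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def gpow(z, k):
    result = (1, 0)
    for _ in range(k):
        result = gmul(result, z)
    return result

N = 28949649300

def enumerate_representations():
    """All A + Bi built from a unit and exponents 0 <= n <= 2, 0 <= m <= 3."""
    units = [(1, 0), (0, 1), (-1, 0), (0, -1)]
    found = set()
    for u in units:
        for n in range(3):
            for m in range(4):
                z = u
                for prime, k in [((1, 1), 2),
                                 ((2, 1), n), ((2, -1), 2 - n),
                                 ((2, 3), m), ((2, -3), 3 - m),
                                 ((3, 0), 1), ((11, 0), 2)]:
                    z = gmul(z, gpow(prime, k))
                found.add(z)
    return found

reps = enumerate_representations()
print(len(reps), all(A * A + B * B == N for A, B in reps))
```

Every pair in `reps` gives a representation $N = A^2 + B^2$, and the set has 48 elements because different choices of unit and exponents give different Gaussian integers.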


We now begin the proof of Legendre’s Sum of Two Squares Theorem. We begin
by factoring $N$ into a product of ordinary primes

$$N = 2^t \, \underbrace{p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}}_{\text{(1 mod 4 primes)}} \, \underbrace{q_1^{f_1} q_2^{f_2} \cdots q_s^{f_s}}_{\text{(3 mod 4 primes)}},$$

where $p_1, \dots, p_r$ are congruent to 1 modulo 4, and $q_1, \dots, q_s$ are congruent to 3
modulo 4. We use Gaussian integers to find a formula for $R(N)$ in terms of the
exponents $t, e_1, \dots, e_r, f_1, \dots, f_s$.

We factor $N$ into a product of Gaussian primes. The integer 2 factors as
$2 = -i(1 + i)^2$, and each $p_j$ factors as

$$p_j = (a_j + b_j i)(a_j - b_j i),$$

while the $q_j$ are themselves Gaussian primes. This gives the factorization

$$N = (-i)^t (1 + i)^{2t} \bigl((a_1 + b_1 i)(a_1 - b_1 i)\bigr)^{e_1} \bigl((a_2 + b_2 i)(a_2 - b_2 i)\bigr)^{e_2} \cdots \bigl((a_r + b_r i)(a_r - b_r i)\bigr)^{e_r} \, q_1^{f_1} q_2^{f_2} \cdots q_s^{f_s}.$$


If any of the exponents $f_1, \dots, f_s$ is odd, then we know that $N$ cannot be
written as a sum of two squares, so $R(N) = 0$. So we now suppose that all of
$f_1, \dots, f_s$ are even, and we suppose that $N$ is written as a sum of two squares, say
$N = A^2 + B^2$. This means that

$$N = (A + Bi)(A - Bi),$$

so $A + Bi$ and $A - Bi$ are composed of the prime factors of $N$. Furthermore,
since $A + Bi$ and $A - Bi$ are complex conjugates of one another, each prime that
appears in one of their factorizations must have its complex conjugate appearing in
the other. This means that $A + Bi$ looks like

$$A + Bi = u (1 + i)^t \bigl((a_1 + b_1 i)^{x_1} (a_1 - b_1 i)^{e_1 - x_1}\bigr) \cdots \bigl((a_r + b_r i)^{x_r} (a_r - b_r i)^{e_r - x_r}\bigr) \, q_1^{f_1/2} q_2^{f_2/2} \cdots q_s^{f_s/2},$$

where $u$ is a unit and the exponents $x_1, \dots, x_r$ satisfy

$$0 \le x_1 \le e_1, \quad 0 \le x_2 \le e_2, \quad \dots, \quad 0 \le x_r \le e_r.$$

Taking the norm of both sides expresses $N$ as a sum of two squares, so counting
the number of choices (four for the unit $u$ and $e_j + 1$ for each exponent $x_j$), we
find that this gives

$$4(e_1 + 1)(e_2 + 1) \cdots (e_r + 1)$$

ways to write $N$ as a sum of two squares. (We leave it as an exercise for you to
check that different choices of $u, x_1, \dots, x_r$ yield different values of $A$ and $B$.)
To recapitulate, we have proved that if the integer $N$ is factored as

$$N = 2^t p_1^{e_1} \cdots p_r^{e_r} q_1^{f_1} \cdots q_s^{f_s}$$

with $p_1, \dots, p_r$ all congruent to 1 modulo 4 and $q_1, \dots, q_s$ all congruent to 3
modulo 4, then

$$R(N) = \begin{cases} 4(e_1 + 1)(e_2 + 1) \cdots (e_r + 1) & \text{if } f_1, \dots, f_s \text{ are all even,} \\ 0 & \text{if any of } f_1, \dots, f_s \text{ is odd.} \end{cases}$$

The proof of Legendre’s Sum of Two Squares Theorem will thus be complete once
we prove that the difference $D_1 - D_3$ is given by the same formula without the
factor of 4.

Theorem 36.6 (Difference of $D_1 - D_3$ Theorem). Factor the integer $N$ into a
product of ordinary primes as

$$N = 2^t \, \underbrace{p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}}_{\text{(1 mod 4 primes)}} \, \underbrace{q_1^{f_1} q_2^{f_2} \cdots q_s^{f_s}}_{\text{(3 mod 4 primes)}}.$$

Let

$D_1$ = (the number of integers $d$ dividing $N$ that satisfy $d \equiv 1 \pmod 4$),
$D_3$ = (the number of integers $d$ dividing $N$ that satisfy $d \equiv 3 \pmod 4$).

Then the difference $D_1 - D_3$ is given by the rule

$$D_1 - D_3 = \begin{cases} (e_1 + 1)(e_2 + 1) \cdots (e_r + 1) & \text{if } f_1, \dots, f_s \text{ are all even,} \\ 0 & \text{if any of } f_1, \dots, f_s \text{ is odd.} \end{cases}$$


Proof. We give a proof by induction on $s$. First, if $s = 0$, then $N = 2^t p_1^{e_1} \cdots p_r^{e_r}$,
so every odd divisor of $N$ is congruent to 1 modulo 4. In other words, $D_3 = 0$,
and $D_1$ is the number of odd divisors of $N$. The odd divisors of $N$ are the numbers
$p_1^{u_1} \cdots p_r^{u_r}$ with each exponent $u_i$ satisfying $0 \le u_i \le e_i$. There are thus $e_i + 1$
choices for $u_i$, which means that the total number of odd divisors is

$$D_1 - D_3 = D_1 = (e_1 + 1)(e_2 + 1) \cdots (e_r + 1).$$

This completes the proof if $s = 0$, that is, if $N$ is not divisible by any 3 modulo 4
primes.

Now let $N$ be divisible by $q$ for some prime $q \equiv 3 \pmod 4$, and assume that
we have completed the proof for all numbers having fewer 3 modulo 4 prime divisors
than $N$. Let $q^f$ be the highest power of $q$ dividing $N$, so $N = q^f n$ with $f \ge 1$
and $q \nmid n$. We consider two cases, depending on whether $f$ is odd or even.

First, suppose that $f$ is odd. The odd divisors of $N$ are the numbers

$$q^i d \quad \text{with } 0 \le i \le f \text{ and } d \text{ odd and dividing } n.$$

Thus each odd divisor $d$ of $n$ gives rise to exactly $f + 1$ odd divisors of $N$, that is, to the
divisors $q^i d$ with $0 \le i \le f$. Since $q \equiv 3 \pmod 4$, these divisors alternate between
$d$ and $3d$ modulo 4 as $i$ increases, and since $f + 1$ is even, exactly half of them are
congruent to 1 modulo 4 and exactly half are congruent to 3 modulo 4. Thus the
odd divisors of $N$ are equally split between $D_1$ and $D_3$, so we have $D_1 - D_3 = 0$. This
completes the proof in the case that $N$ is divisible by an odd power of a 3 modulo 4
prime.

Second, suppose that $N = q^f n$ with $f$ even. Again the odd divisors of $N$
look like $q^i d$ with $0 \le i \le f$ and $d$ odd and dividing $n$. If we only consider
divisors $q^i d$ with exponents $0 \le i \le f - 1$, then the same reasoning as before
shows that the number of 1 modulo 4 divisors is exactly the same as the number
of 3 modulo 4 divisors, so they cancel out in the difference $D_1 - D_3$. So we are
left to consider the divisors of $N$ of the form $q^f d$. The exponent $f$ is even, so
$q^f \equiv 1 \pmod 4$. This means that $q^f d$ counts in $D_1$ if $d \equiv 1 \pmod 4$ and it counts
in $D_3$ if $d \equiv 3 \pmod 4$. In other words,

$$(D_1 \text{ for } N) - (D_3 \text{ for } N) = (D_1 \text{ for } n) - (D_3 \text{ for } n).$$

Our induction hypothesis tells us that the theorem is true for $n$, so we deduce that
the theorem is also true for $N$. This completes the proof of the $D_1 - D_3$ Theorem.
□
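
As a sanity check on the $D_1 - D_3$ Theorem, one can compare a direct count of divisors with the formula in terms of the exponents. The sketch below uses our own function names and simple trial division; it is meant for small $N$ only.

```python
def d1_minus_d3_by_counting(n):
    """Count divisors congruent to 1 and to 3 modulo 4 directly."""
    d1 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 1)
    d3 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 3)
    return d1 - d3

def d1_minus_d3_by_formula(n):
    """Use the factorization: product of (e + 1) over primes p = 1 mod 4,
    or 0 if some prime q = 3 mod 4 appears to an odd power."""
    while n % 2 == 0:          # the power of 2 plays no role
        n //= 2
    result = 1
    p = 3
    while n > 1:
        if p * p > n:          # whatever is left must be prime
            p = n
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        if e:
            if p % 4 == 1:
                result *= e + 1
            elif e % 2 == 1:
                return 0
        p += 2
    return result

print(d1_minus_d3_by_counting(2925), d1_minus_d3_by_formula(2925))
```

Multiplying either value by 4 gives $R(N)$, according to Legendre’s Sum of Two Squares Theorem.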


Exercises 


36.1. (a) Let $\alpha = 2 + 3i$. Plot the four points $\alpha$, $i\alpha$, $-\alpha$, $-i\alpha$ in the complex plane.
Connect the four points. What sort of figure do you get?

(b) Same question with $\alpha = -3 + 4i$.

(c) Let $\alpha = a + bi$ be any nonzero Gaussian integer. Let $A$ be the point in the complex
plane corresponding to $\alpha$, let $B$ be the point in the complex plane corresponding
to $i\alpha$, and let $O = (0, 0)$ be the point corresponding to 0. What is the measure of the
angle $\angle AOB$? That is, what is the measure of the angle made by the rays $OA$ and
$OB$?

(d) Again let $\alpha = a + bi$ be any nonzero Gaussian integer. What sort of shape is formed
by connecting the four points $\alpha$, $i\alpha$, $-\alpha$, and $-i\alpha$? Prove that your answer is correct.


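36.2. For each of the following pairs of Gaussian integers $\alpha$ and $\beta$, find Gaussian integers
$\gamma$ and $\rho$ satisfying

$$\alpha = \beta\gamma + \rho \quad \text{and} \quad N(\rho) < N(\beta).$$

(a) $\alpha = 11 + 17i$, $\beta = 5 + 3i$
(b) $\alpha = 12 - 23i$, $\beta = 7 - 5i$
(c) $\alpha = 21 - 20i$, $\beta = 3 - 7i$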


36.3. Let $\alpha$ and $\beta$ be Gaussian integers with $\beta \ne 0$. We proved that we can always find a
pair of Gaussian integers $(\gamma, \rho)$ that satisfy

$$\alpha = \beta\gamma + \rho \quad \text{and} \quad N(\rho) < N(\beta).$$

(a) Show that there are actually always at least two different pairs $(\gamma, \rho)$ with the desired
properties.

(b) Can you find an $\alpha$ and $\beta$ with exactly three different pairs $(\gamma, \rho)$ having the desired
properties? Either give an example or prove that none exists.

(c) Same as (b), but with exactly four different pairs $(\gamma, \rho)$.

(d) Same as (b), but with exactly five different pairs $(\gamma, \rho)$.

(e) Illustrate your results in (a), (b), (c), and (d) geometrically by dividing a square into
several different regions corresponding to the value of $\alpha/\beta$.


36.4. Let $\alpha$ and $\beta$ be Gaussian integers that are not both zero. We say that a Gaussian
integer $\gamma$ is a greatest common divisor of $\alpha$ and $\beta$ if (i) $\gamma$ divides both $\alpha$ and $\beta$, and
(ii) among all common divisors of $\alpha$ and $\beta$, the quantity $N(\gamma)$ is as large as possible.
(a) Suppose that $\gamma$ and $\delta$ are both greatest common divisors of $\alpha$ and $\beta$. Prove that $\gamma$
divides $\delta$. Use this fact to deduce that $\delta = u\gamma$ for some unit $u$.
(b) Prove that the set

$$\{\alpha r + \beta s : r \text{ and } s \text{ are Gaussian integers}\}$$

contains a greatest common divisor of $\alpha$ and $\beta$. [Hint. Look at the element in the set
that has smallest norm.]
(c) Let $\gamma$ be a greatest common divisor of $\alpha$ and $\beta$. Prove that the set in (b) is equal to
the set

$$\{\gamma t : t \text{ is a Gaussian integer}\}.$$


36.5. Find a greatest common divisor for each of the following pairs of Gaussian integers.
(a) $\alpha = 8 + 38i$ and $\beta = 9 + 59i$
(b) $\alpha = -9 + 19i$ and $\beta = -19 + 4i$
(c) $\alpha = 40 + 60i$ and $\beta = 117 - 26i$
(d) $\alpha = 16 - 120i$ and $\beta = 52 + 68i$


36.6. Let $R$ be the following set of complex numbers:

$$R = \{a + bi\sqrt{5} : a \text{ and } b \text{ are ordinary integers}\}.$$

(a) Verify that $R$ is a ring. That is, verify that the sum, difference, and product of
elements of $R$ are again in $R$.

(b) Show that the only solutions to $\alpha\beta = 1$ in $R$ are $\alpha = \beta = 1$ and $\alpha = \beta = -1$.
Conclude that 1 and $-1$ are the only units in the ring $R$.

(c) Let $\alpha$ and $\beta$ be elements of $R$. We say that $\beta$ divides $\alpha$ if there is an element $\gamma$ in $R$
satisfying $\alpha = \beta\gamma$. Show that $3 + 2i\sqrt{5}$ divides $85 - 11i\sqrt{5}$.

(d) We say that an element $\alpha$ of $R$ is irreducible* if its only divisors in $R$ are $\pm 1$ and $\pm\alpha$.
Prove that the number 2 is an irreducible element of $R$.

*More generally, an element $\alpha$ whose only divisors are $u$ and $u\alpha$ with $u$ a unit is called an
irreducible element. The name prime is reserved for an element $\alpha$ with the property that if it divides
a product, then it always divides at least one of the factors. For ordinary integers and for the Gaussian
integers, we proved that every irreducible element is prime, but this is not true for the ring $R$ in this
exercise.

(e) We define the norm of an element $\alpha = a + bi\sqrt{5}$ in $R$ to be $N(\alpha) = a^2 + 5b^2$. Let
$\alpha = 11 + 2i\sqrt{5}$ and $\beta = 1 + i\sqrt{5}$. Show that it is not possible to find elements $\gamma$
and $\rho$ in $R$ satisfying

$$\alpha = \beta\gamma + \rho \quad \text{and} \quad N(\rho) < N(\beta).$$

Thus $R$ does not have the Division with Remainder property. [Hint. Draw a picture
illustrating the points in $R$ and the complex number $\alpha/\beta$.]

(f) The irreducible element 2 clearly divides the product

$$(1 + i\sqrt{5})(1 - i\sqrt{5}) = 6.$$

Show that 2 does not divide either of the factors $1 + i\sqrt{5}$ or $1 - i\sqrt{5}$.

(g) Show that the number 6 has two truly different factorizations into irreducible
elements of $R$ by verifying that the numbers in the factorizations

$$6 = 2 \cdot 3 = (1 + i\sqrt{5})(1 - i\sqrt{5})$$

are all irreducible.

(h) Find some other numbers $\alpha$ in $R$ that have two truly different factorizations
$\alpha = \pi_1\pi_2 = \pi_3\pi_4$, where $\pi_1, \pi_2, \pi_3, \pi_4$ are distinct irreducible elements of $R$.

(i) Can you find distinct irreducibles $\pi_1, \pi_2, \pi_3, \pi_4, \pi_5, \pi_6$ in $R$ with the property that
$\pi_1\pi_2 = \pi_3\pi_4 = \pi_5\pi_6$?


36.7. During the proof of Legendre’s Sum of Two Squares Theorem, we needed to know
that different choices of the unit $u$ and the exponents $x_1, \dots, x_r$ in the formula

$$A + Bi = u (1 + i)^t \bigl((a_1 + b_1 i)^{x_1} (a_1 - b_1 i)^{e_1 - x_1}\bigr) \cdots \bigl((a_r + b_r i)^{x_r} (a_r - b_r i)^{e_r - x_r}\bigr) \, q_1^{f_1/2} q_2^{f_2/2} \cdots q_s^{f_s/2}$$

yield different values of $A$ and $B$. Prove that this is indeed the case.


36.8. (a) Make a list of all the divisors of the number $N = 2925$.
(b) Use (a) to compute $D_1$ and $D_3$, the number of divisors of 2925 congruent to 1 and 3
modulo 4, respectively.
(c) Use Legendre’s Sum of Two Squares Theorem to compute $R(2925)$.
(d) Make a list of all the ways of writing 2925 as a sum of two squares and check that it
agrees with your answer in (c).


36.9. For each of the following values of $N$, compute the values of $D_1$ and $D_3$, check your
answer by comparing the difference $D_1 - D_3$ to the formula given in the $D_1 - D_3$ Theorem,
and use Legendre’s Sum of Two Squares Theorem to compute $R(N)$. If $R(N) \ne 0$, find
at least four distinct ways of writing $N = A^2 + B^2$ with $A > B > 0$.

(a) $N = 327026700$

(b) $N = 484438500$

Chapter 37 


Irrational Numbers 
and Transcendental Numbers 


In the historical development of numbers and mathematics, fractions (also called 
rational numbers since they are ratios) appeared quite early, having been used in 
ancient Egypt as early as 1700 BCE. Rational numbers come up very naturally 
as soon as a civilization needs to subdivide land or cloth or gold or whatever into 
pieces. Fractions also appear when two quantities are compared. To take a con- 
crete example, the distance from Cairo to Luxor is more than twice as far as the 
distance from Cairo to Alexandria, but less than three times as far. Such a state- 
ment is helpful, but not particularly precise. On the other hand, for most practical 
purposes it suffices to say that the former distance is 17/6 times the latter distance. 
This means that 6 times the distance from Cairo to Luxor is equal to 17 times the 
distance from Cairo to Alexandria. We say that two quantities are commensurable 
if a nonzero integer multiple of the first is equal to a nonzero integer multiple of 
the second; or, equivalently, if their ratio is a rational number. Notice that this is 
how we measure distances today. When we say that it is 3.7 miles to the center of 
town, what we really mean is that 10 times the distance to the center of town is the 
same as 37 times the length of an idealized distance called a “mile.” 

For a very long time, people who gave the matter any thought seem to have 
assumed that every number is a rational number. In geometric terms, they assumed 
that any two distances were commensurable. The first indication that this might not 
be true appeared in Greece about 2500 years ago. Ironically, nonrational numbers 
made their debut in the Pythagorean Theorem, that gem of classical mathematics 
about which we have already waxed poetic in Chapter 2. Although the Pythagorean 
Theorem was known long before the time of Pythagoras, it was in ancient Greece 
that someone (possibly Pythagoras himself) first observed that the hypotenuse of
an isosceles right triangle (see Figure 37.1) is not commensurable with the sides.

For example, the Pythagorean Theorem tells us that an isosceles right triangle
whose sides have length 1 has a hypotenuse of length $\sqrt{2}$. It is not known exactly
how the Pythagoreans deduced that the sides and the hypotenuse of such a triangle
are incommensurable, but the following elegant proof of the irrationality of $\sqrt{2}$ is
adapted from the 10th book of Euclid’s Elements.


Figure 37.1: An Incommensurable Hypotenuse


Theorem 37.1 (Irrationality of $\sqrt{2}$ Theorem). The square root of 2 is irrational.
That is, there is no rational number $r$ satisfying $r^2 = 2$.


Proof. We assume that there does exist a rational number $r$ satisfying $r^2 = 2$, and
we use the supposed existence of $r$ to end up with a contradictory statement, that is,
with a statement that is clearly false. This contradiction shows that such an $r$ does
not exist. As noted in Chapter 36, this method of proof by contradiction (reductio
ad absurdum in Latin) is a powerful tool in the mathematical arsenal.

Now for the details. We assume that $r$ is a rational number satisfying $r^2 = 2$.
Since $r$ is a rational number, we can write $r$ as a fraction $r = a/b$, and since we can
always cancel factors that are common to the numerator and denominator, we can
assume that $a$ and $b$ are relatively prime. In other words, we write $r$ as a fraction in
lowest terms.

The assumption that $r^2 = 2$ means that

$$a^2 = 2b^2.$$

In particular, $a^2$ is even, so $a$ must be even, say $a = 2A$. If we substitute this in
and cancel 2 from each side, we get

$$2A^2 = b^2,$$

so $b$ must also be even. But $a$ and $b$ are relatively prime, so they can’t both be
even, which gives us the desired contradiction. Since the existence of $r$ leads to a
contradiction, we are forced to conclude that $r$ cannot exist. Therefore, there is no
rational number whose square equals 2. □


This proof of the irrationality of $\sqrt{2}$ can be generalized in many ways. For
example, let’s prove that if $p$ is any prime then $\sqrt{p}$ is irrational. As before, we
assume that there is a rational number $r$ satisfying $r^2 = p$ and try to deduce a
contradiction. Writing $r = a/b$ as a fraction in lowest terms, we obtain

$$a^2 = pb^2,$$

so $p$ divides $a^2$.

In Chapter 7 we showed that if a prime divides a product of two numbers then
it must divide at least one of the numbers. In this case, the prime $p$ divides the
product $a \cdot a$, so we conclude that $p$ divides $a$, say $a = pA$. Substituting and
canceling $p$ gives

$$pA^2 = b^2,$$

so by the same reasoning we deduce that $p$ divides $b$. Thus $p$ divides both $a$ and $b$,
which contradicts the fact that $a$ and $b$ are relatively prime. Therefore, $r$ cannot
exist, which completes the proof that $\sqrt{p}$ is irrational.


Philosophical Interlude. The method of proof by contradiction (reductio ad ab- 
surdum) is based on the principle that if a statement leads to a false conclusion 
then the original statement is itself false. Although common sense says that this 
principle is valid, it actually depends on the underlying assumption that the orig- 
inal statement must be either true or false. The assumption that every statement 
is either true or false is called the Law of the Excluded Middle, and despite its 
grand-sounding name, the Law of the Excluded Middle is really an assumption (in 
mathematical terms, an axiom) that is used in the formal construction of
mathematical systems.¹ Some mathematicians and logicians do not accept the Law of
the Excluded Middle and have constructed mathematical theories without using 
proofs by contradiction. 


¹Actually, life is even more complicated than indicated in our brief philosophical digression. Kurt
Gödel proved in the 1930s that any “interesting” mathematical system (for example, the theory of
numbers) contains statements that are undecidable, which means that they are neither provably true
nor provably false within the given mathematical system. A mind-twisting challenge for you is to
try to imagine how one proves that certain statements cannot be proved! A further philosophical
conundrum: Can a statement be true even if it is not possible to prove that it is true? What does
“true” mean? If you believe that absolute mathematical knowledge already exists and is merely
discovered, rather than created, by mathematicians (see Exercise 35.2 and the footnote on page 268),
then in some sense, isn’t every statement either true or false?

When we say that $\sqrt{2}$ is irrational, we are really asserting that the polynomial

$$X^2 - 2$$

has no rational roots, and similarly the irrationality of $\sqrt{p}$ for primes $p$ is the same
as saying that $X^2 - p$ has no rational roots. In general, a polynomial

$$c_0 X^d + c_1 X^{d-1} + c_2 X^{d-2} + \cdots + c_{d-1} X + c_d$$


with integer coefficients is likely to have many irrational roots, although it is
frequently difficult to figure out what the roots look like. For example, one of the
roots of the polynomial

$$X^{12} - 66X^{10} - 8X^9 + 1815X^8 - 26610X^6 + 5808X^5 + 218097X^4 - 85160X^3 - 971388X^2 + 352176X + 1742288$$

is the horrible-looking number²

$$\sqrt{11} + \sqrt[3]{2 + \sqrt{7}}.$$
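
One way to produce such a polynomial, offered here only as a sketch under our own assumptions, is to work backward from the number $\sqrt{11} + \sqrt[3]{2 + \sqrt{7}}$. Writing $z = X - \sqrt{11}$, the equation $z^3 = 2 + \sqrt{7}$ gives $(z^3 - 2)^2 = 7$, that is, $z^6 - 4z^3 - 3 = 0$. This still involves $\sqrt{11}$, so we multiply by the same expression with $\sqrt{11}$ replaced by $-\sqrt{11}$. The helper names below are ours; polynomials are stored as coefficient lists, lowest power first, and an expression $P + Q\sqrt{11}$ is stored as the pair `(P, Q)`.

```python
def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_scale(c, p):
    return [c * a for a in p]

def mul(u, v):
    """(p1 + q1*sqrt(11)) * (p2 + q2*sqrt(11)), using sqrt(11)^2 = 11."""
    p1, q1 = u
    p2, q2 = v
    return (poly_add(poly_mul(p1, p2), poly_scale(11, poly_mul(q1, q2))),
            poly_add(poly_mul(p1, q2), poly_mul(q1, p2)))

def power(u, n):
    result = ([1], [0])
    for _ in range(n):
        result = mul(result, u)
    return result

z = ([0, 1], [-1])                       # z = X - sqrt(11)
z3, z6 = power(z, 3), power(z, 6)

# F = z^6 - 4 z^3 - 3, written as P + Q*sqrt(11)
P = poly_add(poly_add(z6[0], poly_scale(-4, z3[0])), [-3])
Q = poly_add(z6[1], poly_scale(-4, z3[1]))

# (P + Q*sqrt(11)) * (P - Q*sqrt(11)) = P^2 - 11 Q^2 has integer coefficients.
norm = poly_add(poly_mul(P, P), poly_scale(-11, poly_mul(Q, Q)))
coeffs_high_to_low = norm[::-1]
print(coeffs_high_to_low)
```

The printed list gives the coefficients from $X^{12}$ down to the constant term, and can be compared with the polynomial displayed above.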


There are obviously many polynomials with integer coefficients, and most of
them have irrational roots. We say that a number is algebraic if it is the root of a
polynomial with integer coefficients. For example, the numbers

$$\sqrt{2}, \quad 7, \quad \sin(\pi/6), \quad \text{and even} \quad \sqrt{11} + \sqrt[3]{2 + \sqrt{7}}$$

are all algebraic numbers. Note that every rational number $a/b$ is an algebraic
number, since it is a root of the polynomial $bX - a$; but, as we have seen, many
algebraic numbers are not rational numbers.

Given the seeming abundance of algebraic numbers, we might hope that every
irrational number is algebraic; that is, we might hope that every irrational number
is the root of a polynomial having integer coefficients. To take a specific example,
do you think that the familiar number $\pi = 3.1415926\ldots$ is an algebraic number?
In the mid-eighteenth century, Leonhard Euler suggested that it is not.³ A number
that is not algebraic is called a transcendental number, because it transcends the
numbers that are roots of polynomials with integer coefficients.


²How do you think I found this complicated root of such a huge polynomial?

³Euler wrote (in 1755) that “it appears to be fairly certain that the periphery of a circle constitutes
such a peculiar kind of transcendental quantity that it can in no way be compared with other quantities,
either roots or other transcendentals.” Legendre proved in 1794 that $\pi^2$ is irrational and noted
that “it is probable that the number $\pi$ is not even contained among the algebraic irrationalities . . . but
it seems to be very difficult to prove this strictly.”

Euler and his contemporaries were not able to prove that $\pi$ is transcendental,
and indeed, it took more than 100 years before F. Lindemann proved the
transcendence of $\pi$ in 1882. Unfortunately, even with subsequent simplifications, the proof
that $\pi$ is transcendental is too complicated for us to give here. Actually, it’s not
easy to show that transcendental numbers exist at all! The first person to exhibit a
transcendental number was Joseph Liouville in 1844. We follow Liouville’s path
by taking a particular number and proving that it is transcendental. Liouville’s
number is given by the nonrepeating decimal

$$\beta = 0.11000100000000000000000100\cdots00100\cdots00100\cdots,$$

whose ones occupy decimal digits 1, 2, 6, 24, 120, 720, and so on.
More precisely, the $n$th “one” in the decimal expansion of $\beta$ appears as the $n!$th
(that’s $n$ factorial) decimal digit, and all the rest of the decimal digits are zeros.
Another way to write $\beta$ is

$$\beta = \frac{1}{10} + \frac{1}{10^2} + \frac{1}{10^6} + \frac{1}{10^{24}} + \frac{1}{10^{120}} + \frac{1}{10^{720}} + \cdots$$

or, using summation notation for infinite series,

$$\beta = \sum_{n=1}^{\infty} \frac{1}{10^{n!}}.$$

To show that $\beta$ is transcendental, we need to show that $\beta$ is not the root of
any polynomial having integer coefficients. Just as in the proof of the irrationality
of $\sqrt{2}$, we give a proof by contradiction. Thus we suppose that the polynomial

$$f(X) = c_0 X^d + c_1 X^{d-1} + c_2 X^{d-2} + \cdots + c_{d-1} X + c_d$$

has integer coefficients and $f(\beta) = 0$. Liouville’s brilliant idea is that if an
irrational number is the root of a polynomial, then it cannot be too close to a rational
number. So before studying Liouville’s number, we make a brief detour to discuss
the question of approximating irrational numbers by rational numbers.

You may recall Dirichlet’s Diophantine Approximation Theorem (see Chapter 33),
which says that for any irrational number $\alpha$ there are infinitely many rational
numbers $a/b$ such that

$$\left| \frac{a}{b} - \alpha \right| < \frac{1}{b^2}.$$

In other words, we can find lots of rational numbers that are fairly close to $\alpha$. We
might ask whether we can get even closer. For example, are there infinitely many
rational numbers $a/b$ such that

$$\left| \frac{a}{b} - \alpha \right| < \frac{1}{b^3}?$$

The answer depends to some extent on the number $\alpha$.

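For example, suppose we take $\alpha = \sqrt{2}$. This means that $\alpha$ is a root of $f(X) =
X^2 - 2$, so if $a/b$ is close to $\alpha$, then $f(a/b)$ should be fairly small. How can we
quantify this observation? We can measure the smallness of $f(a/b)$ by factoring

$$f\left(\frac{a}{b}\right) = \left(\frac{a}{b}\right)^2 - 2 = \left(\frac{a}{b} + \sqrt{2}\right)\left(\frac{a}{b} - \sqrt{2}\right).$$

If $a/b$ is close to $\sqrt{2}$, then the first factor $a/b + \sqrt{2}$ is close to $2\sqrt{2}$, so certainly it
will be smaller than (say) 4. This allows us to estimate

$$\left| f\left(\frac{a}{b}\right) \right| \le 4 \left| \frac{a}{b} - \sqrt{2} \right|.$$

On the other hand, we can write

$$f\left(\frac{a}{b}\right) = \left(\frac{a}{b}\right)^2 - 2 = \frac{a^2 - 2b^2}{b^2}.$$

Notice that the numerator $a^2 - 2b^2$ is a nonzero integer. (Why is it nonzero?
Answer: Because $\sqrt{2}$ is irrational, and so cannot equal $a/b$.) Of course, we don’t
know the exact value of $a^2 - 2b^2$, but we do know that the absolute value of a
nonzero integer must be at least 1.⁴ Hence

$$\left| f\left(\frac{a}{b}\right) \right| = \frac{|a^2 - 2b^2|}{b^2} \ge \frac{1}{b^2}.$$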


We now have an upper bound and a lower bound for $|f(a/b)|$, and if we put
them together, we obtain the interesting inequality

$$\frac{1}{4b^2} \le \left| \frac{a}{b} - \sqrt{2} \right|, \qquad (1)$$

which is valid for every rational number $a/b$. Notice how this inequality
complements Dirichlet’s inequality

$$\left| \frac{a}{b} - \sqrt{2} \right| < \frac{1}{b^2}.$$

In particular, we can use (1) to show that a stronger inequality such as

$$\left| \frac{a}{b} - \sqrt{2} \right| < \frac{1}{b^3} \qquad (2)$$

can have only finitely many solutions. To do this, we combine the inequalities (1)
and (2) to obtain

$$\frac{1}{4b^2} < \frac{1}{b^3}, \quad \text{and hence} \quad b < 4.$$

This means that the only possibilities for $b$ are $b = 1, 2, 3$, and then for each value
of $b$, the inequality (2) allows at most a finite number of possible values for $a$. In
fact, we find that (2) has exactly three solutions: $a/b = 1/1$, $a/b = 2/1$, and $a/b = 3/2$.

⁴The fact we are using here is the seemingly trivial observation that there are no whole numbers
lying strictly between 0 and 1. Although it seems trivial, this fact lies at the heart of all proofs of
transcendence. It is equivalent to the fancier-sounding well-ordering property of the nonnegative
integers, which asserts that any set of nonnegative integers has a smallest element.
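
The short search described above is easy to carry out by machine. The following sketch, with a function name of our own choosing, tries each denominator $b = 1, 2, 3$ and the few numerators $a$ near $b\sqrt{2}$, keeping the fractions in lowest terms that satisfy inequality (2). It uses floating-point arithmetic, which is adequate here because no candidate lies close to the boundary.

```python
from fractions import Fraction
from math import gcd, sqrt

def solutions_of_inequality_2(max_b=3):
    """Fractions a/b in lowest terms with |a/b - sqrt(2)| < 1/b^3 and b <= max_b."""
    found = []
    for b in range(1, max_b + 1):
        center = b * sqrt(2)
        # Any solution has |a - b*sqrt(2)| < 1/b^2 <= 1, so a is near center.
        for a in range(int(center) - 1, int(center) + 3):
            if gcd(a, b) == 1 and abs(a / b - sqrt(2)) < 1 / b ** 3:
                found.append(Fraction(a, b))
    return sorted(found)

print(solutions_of_inequality_2())
```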

Let’s review what we’ve done. We’ve used the fact that $\sqrt{2}$ is a root of the
polynomial $X^2 - 2$ to deduce an inequality (1) that says that a rational number
$a/b$ cannot be too close to $\sqrt{2}$. Liouville’s proof that the number $\beta$ given above
is transcendental rests on the following two legs (which might make an unsteady
table, but is perfectly acceptable for a proof):

(i) If $\alpha$ is an algebraic number, that is, if $\alpha$ is a root of a polynomial with integer
coefficients, then a rational number $a/b$ cannot be too close to $\alpha$.

(ii) For the number $\beta$ given above, there are lots of rational numbers $a/b$ that are
extremely close to $\beta$.

Our aim is to take these two qualitative statements and make them precise. We start
with statement (i), whose quantification takes the following form.


Theorem 37.2 (Liouville’s Inequality). Let $\alpha$ be an algebraic number; say $\alpha$ is a
root of the polynomial

$$f(X) = c_0 X^d + c_1 X^{d-1} + c_2 X^{d-2} + \cdots + c_{d-1} X + c_d$$

having integer coefficients. Let $D$ be any number with $D > d$ (i.e., $D$ is larger
than the degree of the polynomial $f$). Then there are only finitely many rational
numbers $a/b$ that satisfy the inequality

$$\left| \frac{a}{b} - \alpha \right| \le \frac{1}{b^D}. \qquad (*)$$


Proof. The fact that $X = \alpha$ is a root of $f(X)$ means that when we divide $f(X)$ by
$X - \alpha$, we get a remainder of 0. In other words, $f(X)$ factors as

$$f(X) = (X - \alpha) g(X)$$

for some polynomial

$$g(X) = e_1 X^{d-1} + e_2 X^{d-2} + \cdots + e_{d-1} X + e_d.$$

For example, the algebraic number $\sqrt[3]{7}$ is a root of the polynomial $X^3 - 7$, and
when we divide $X^3 - 7$ by $X - \sqrt[3]{7}$, we obtain the factorization

$$X^3 - 7 = (X - \sqrt[3]{7})(X^2 + \sqrt[3]{7}\,X + \sqrt[3]{49}).$$

Notice that the coefficients $e_1, \dots, e_d$ won’t be integers, but this won’t cause any
problems for us.
Suppose now that $a/b$ is a solution to the inequality $(*)$.
If we substitute $X = a/b$ into the factorization $f(X) = (X - \alpha)g(X)$ and take
absolute values, we obtain the fundamental formula

$$\left| f\left(\frac{a}{b}\right) \right| = \left| \frac{a}{b} - \alpha \right| \cdot \left| g\left(\frac{a}{b}\right) \right|.$$

The importance of this formula is that the right-hand side is small if $a/b$ is close
to $\alpha$, while the left-hand side is a rational number. The next two things we need to
do are find an upper bound for $|g(a/b)|$ and a lower bound for $|f(a/b)|$.

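We start with the latter. If we write out $f(a/b)$ and put it over a common
denominator, we obtain

$$f\left(\frac{a}{b}\right) = c_0 \left(\frac{a}{b}\right)^d + c_1 \left(\frac{a}{b}\right)^{d-1} + \cdots + c_{d-1} \left(\frac{a}{b}\right) + c_d
= \frac{c_0 a^d + c_1 a^{d-1} b + c_2 a^{d-2} b^2 + \cdots + c_{d-1} a b^{d-1} + c_d b^d}{b^d}.$$

Note that the numerator of this fraction is an integer, and so as long as it isn’t zero,

$$\left| f\left(\frac{a}{b}\right) \right| \ge \frac{1}{b^d}.$$

[We deal later with the case that $f(a/b) = 0$.] We can illustrate this using our
example $f(X) = X^3 - 7$ from before,

$$\left| f\left(\frac{a}{b}\right) \right| = \left| \left(\frac{a}{b}\right)^3 - 7 \right| = \frac{|a^3 - 7b^3|}{b^3} \ge \frac{1}{b^3}.$$

Next we want an upper bound for $|g(a/b)|$. The fact that $a/b$ is a solution to the
inequality $(*)$ certainly implies that

$$\left| \frac{a}{b} \right| \le |\alpha| + 1,$$

so we can estimate

$$\left| g\left(\frac{a}{b}\right) \right| \le |e_1| \left|\frac{a}{b}\right|^{d-1} + |e_2| \left|\frac{a}{b}\right|^{d-2} + |e_3| \left|\frac{a}{b}\right|^{d-3} + \cdots + |e_{d-1}| \left|\frac{a}{b}\right| + |e_d|$$
$$\le |e_1| (|\alpha| + 1)^{d-1} + |e_2| (|\alpha| + 1)^{d-2} + |e_3| (|\alpha| + 1)^{d-3} + \cdots + |e_{d-1}| (|\alpha| + 1) + |e_d|.$$

This last quantity is rather messy, but whatever it equals, it has one tremendously
important property: It doesn’t depend on the rational number $a/b$. In other words,
we have shown that there is a positive number $K$ such that if $a/b$ is any solution to
the inequality $(*)$, then

$$|g(a/b)| \le K.$$

Again we illustrate this estimate with our example $f(X) = X^3 - 7$, where we use
the bound $|a/b| \le \sqrt[3]{7} + 1$. Thus

$$\left| g\left(\frac{a}{b}\right) \right| \le \left|\frac{a}{b}\right|^2 + \sqrt[3]{7} \left|\frac{a}{b}\right| + \sqrt[3]{49}
\le (\sqrt[3]{7} + 1)^2 + \sqrt[3]{7}\,(\sqrt[3]{7} + 1) + \sqrt[3]{49} \approx 17.717,$$

so for this example we could take $K = 17.717$.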
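We now have

The Inequality $(*)$: $\left| \frac{a}{b} - \alpha \right| \le \frac{1}{b^D}$
A Factorization Formula: $\left| f\left(\frac{a}{b}\right) \right| = \left| \frac{a}{b} - \alpha \right| \cdot \left| g\left(\frac{a}{b}\right) \right|$
A Lower Bound: $\left| f\left(\frac{a}{b}\right) \right| \ge \frac{1}{b^d}$
An Upper Bound: $\left| g\left(\frac{a}{b}\right) \right| \le K$

Putting them together yields

$$\frac{1}{b^d} \le \left| f\left(\frac{a}{b}\right) \right| = \left| \frac{a}{b} - \alpha \right| \cdot \left| g\left(\frac{a}{b}\right) \right| \le \frac{K}{b^D}.$$

Since we are told that $D > d$, we can isolate $b$ on the left-hand side to obtain the
upper bound

$$b \le K^{1/(D-d)}.$$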
To illustrate this using our example $\alpha = \sqrt[3]{7}$ and $f(X) = X^3 - 7$, we have $d = 3$,
and we found that we can take $K = 17.717$, so if we take (say) $D = 3.5$, then we
obtain the bound

$$b \le 17.717^{1/(3.5 - 3)} \approx 313.89.$$

Now you can see why it was so important that the upper bound $K$ not depend
on the number $a/b$, since it is this fact that allows us to conclude that there are
only finitely many allowable values for $b$. (Note that $b$ is necessarily a positive
integer, since it is the denominator of the fraction $a/b$ written in lowest terms.)
Furthermore, for each fixed choice of $b$, there are only finitely many values of $a$
for which the inequality $(*)$ holds. (In fact, if $b^{D-1} > 2$, then there’s at most one
allowable $a$ for a given $b$.) Returning to our example one last time, we see that
the allowable $b$’s are the integers $1 \le b \le 313$, and then for each particular $b$, the
corresponding allowable $a$’s [i.e., those that are solutions to the inequality $(*)$] are
those satisfying

$$b\sqrt[3]{7} - \frac{1}{b^{2.5}} \le a \le b\sqrt[3]{7} + \frac{1}{b^{2.5}}.$$

This shows that there are only finitely many solutions, and a quick computation (on
a computer) reveals that for this example there are only the two solutions $a/b = 1/1$
and $2/1$.

We have almost completed our proof that the inequality $(*)$ has only finitely
many solutions $a/b$. If you review what we’ve done so far, you’ll see that what
we have actually proved is that $(*)$ has only finitely many solutions satisfying
$f(a/b) \ne 0$. Thus we still need to deal with the roots of $f(X)$. But a polynomial
of degree $d$ has at most $d$ roots of any sort, rational or irrational, so the finitely
many rational roots of $f(X)$ don’t change our conclusion that $(*)$ has only finitely
many solutions. □
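
The “quick computation” for the example $\alpha = \sqrt[3]{7}$, $d = 3$, $D = 3.5$ might look something like the following sketch. The variable and function names are ours, and the arithmetic is done in floating point, which is accurate enough for denominators this small. The code first recomputes the constant $K$ and the bound on $b$, then searches every denominator up to that bound.

```python
from math import gcd

alpha = 7 ** (1 / 3)          # the cube root of 7
d, D = 3, 3.5                 # degree of X^3 - 7 and the exponent chosen in the text

# Coefficients of g(X) = X^2 + alpha*X + alpha^2, highest power first.
g_coeffs = [1.0, alpha, alpha ** 2]
bound = abs(alpha) + 1        # |a/b| <= |alpha| + 1 for any solution of (*)
K = sum(abs(e) * bound ** (len(g_coeffs) - 1 - k) for k, e in enumerate(g_coeffs))
b_max = K ** (1 / (D - d))    # the upper bound b <= K^(1/(D-d))

def solutions_of_star(max_b):
    """Pairs (a, b) in lowest terms with |a/b - alpha| <= 1/b^D and 1 <= b <= max_b."""
    found = []
    for b in range(1, max_b + 1):
        nearest = round(b * alpha)
        # A solution needs |a - b*alpha| <= 1/b^(D-1) <= 1, so a is within 1 of nearest.
        for a in (nearest - 1, nearest, nearest + 1):
            if gcd(a, b) == 1 and abs(a / b - alpha) <= 1 / b ** D:
                found.append((a, b))
    return found

print(round(K, 3), int(b_max))
print(solutions_of_star(int(b_max)))
```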


Liouville’s Inequality says that an algebraic number $\alpha$ cannot be too closely
approximated by rational numbers. The next Lemma, which is the second leg in
our proof, says that Liouville’s number $\beta$ can be very closely approximated by lots
of rational numbers.


Lemma 37.3 (Lemma on Good Approximations to $\beta$). Let $\beta$ be Liouville’s number

$$\beta = \sum_{n=1}^{\infty} \frac{1}{10^{n!}}$$

as described above. Then for every number $D \ge 1$ we can find infinitely many
different rational numbers $a/b$ that satisfy the inequality

$$\left| \frac{a}{b} - \beta \right| \le \frac{1}{b^D}.$$


Proof. Intuitively, Lemma 37.3 says that we can find rational numbers that are
very, very close to $\beta$. How might we find such good approximations? The definition of

$$\beta = \frac{1}{10^{1!}} + \frac{1}{10^{2!}} + \frac{1}{10^{3!}} + \frac{1}{10^{4!}} + \frac{1}{10^{5!}} + \frac{1}{10^{6!}} + \cdots$$

provides the clue. The terms in this series are decreasing very rapidly, so if we
just take the first few terms, we should get a pretty good approximation to $\beta$. For
example, if we take the first four terms, then we get the rational number

$$r_4 = \frac{1}{10^{1!}} + \frac{1}{10^{2!}} + \frac{1}{10^{3!}} + \frac{1}{10^{4!}} = 0.110001000000000000000001.$$

Then $|r_4 - \beta|$ has a decimal expansion whose first 119 decimal digits are all zero,
so $|r_4 - \beta| < 2 \cdot 10^{-120}$, which is certainly very small. On the other hand, if we
write $r_4$ as a fraction $a_4/b_4$, we find that

$$r_4 = \frac{a_4}{b_4} = \frac{110001000000000000000001}{10^{24}},$$

so its denominator $b_4$ is “only” $10^{24}$. This may seem large, but notice that
$|r_4 - \beta| < 2 \cdot 10^{-120} < 1/b_4^4$, so $r_4$ is a rather good approximation to $\beta$.

More generally, suppose we take the first $N$ terms in the series and add them
to form the rational number

$$r_N = \frac{a_N}{b_N} = \frac{1}{10} + \frac{1}{10^{2}} + \frac{1}{10^{6}} + \cdots + \frac{1}{10^{N!}}.$$

We need to estimate the size of $b_N$ and also how close $r_N$ is to $\beta$.
The denominators of the fractions we’re adding to form $r_N$ are all powers of 10,
so the least common denominator is the last one,

$$b_N = 10^{N!}.$$

On the other hand, the difference $\beta - r_N$ looks like

$$\beta - r_N = \frac{1}{10^{(N+1)!}} + \frac{1}{10^{(N+2)!}} + \frac{1}{10^{(N+3)!}} + \cdots.$$

Thus the first nonzero digit in the decimal expansion of $\beta - r_N$ occurs at the
$(N+1)!$th digit, and this digit is a 1. This shows that the difference $\beta - r_N$ is
certainly smaller than the number that has a 2 as its $(N+1)!$th digit. In other
words,

$$0 < \beta - r_N < \frac{2}{10^{(N+1)!}}.$$

To relate this to the value of $b_N$, we observe that

$$10^{(N+1)!} = \left(10^{N!}\right)^{N+1} = b_N^{N+1},$$

so we find that

$$0 < \beta - r_N < \frac{2}{b_N^{N+1}}.$$

To recapitulate, for every $N \ge 1$ we have found a rational number $a_N/b_N$ such
that

$$\left|\frac{a_N}{b_N} - \beta\right| < \frac{2}{b_N^{N+1}} = \frac{2}{b_N} \cdot \frac{1}{b_N^{N}} < \frac{1}{b_N^{N}}.$$

Furthermore, these rational numbers are all different, since their denominators
$b_N = 10^{N!}$ are different. Hence the rational numbers $a_N/b_N$ with $N \ge D$ provide
infinitely many solutions to the inequality

$$\left|\frac{a}{b} - \beta\right| \le \frac{1}{b^{D}},$$

which completes the proof of Lemma 37.3. □
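
The estimates in this proof can be checked with exact rational arithmetic. The following is a minimal sketch in Python, not part of the original argument; the function names are illustrative. Since $\beta$ is an infinite series, the sketch uses a longer partial sum as a stand-in for $\beta$, so the computed tail is very slightly smaller than the true difference $\beta - r_N$.

```python
from fractions import Fraction
from math import factorial


def liouville_partial_sum(N):
    """Return r_N = 1/10^(1!) + 1/10^(2!) + ... + 1/10^(N!) as an exact fraction."""
    return sum(Fraction(1, 10 ** factorial(n)) for n in range(1, N + 1))


def check_approximation(N, extra_terms=2):
    """Check b_N = 10^(N!) and 0 < tail < 2 / b_N^(N+1).

    The tail r_(N+extra_terms) - r_N is only a stand-in for beta - r_N.
    """
    r_N = liouville_partial_sum(N)
    b_N = r_N.denominator  # Fraction keeps r_N in lowest terms
    tail = liouville_partial_sum(N + extra_terms) - r_N
    bound = Fraction(2, b_N ** (N + 1))
    return b_N == 10 ** factorial(N) and 0 < tail < bound


for N in range(1, 5):
    print(N, check_approximation(N))
```

For $N = 4$ the denominator is $10^{24}$ and the bound is $2/10^{120}$, matching the numbers worked out by hand above.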

We now have the two ingredients needed to prove that $\beta$ is transcendental.

Theorem 37.4 (Transcendence of $\beta$ Theorem). Liouville’s number

$$\beta = \sum_{n=1}^{\infty} \frac{1}{10^{n!}}$$

is transcendental.


Proof. We give a proof by contradiction, so we start by assuming that $\beta$ is actually
algebraic and try to derive a false statement. The assumption that $\beta$ is algebraic
means that it is a root of a polynomial

$$f(X) = c_0X^{d} + c_1X^{d-1} + c_2X^{d-2} + \cdots + c_{d-1}X + c_d$$

having integer coefficients. Let $D = d + 1$. Then Liouville’s Inequality tells us
that there are only finitely many rational numbers $a/b$ that satisfy the inequality

$$\left|\frac{a}{b} - \beta\right| \le \frac{1}{b^{D}}.$$

On the other hand, Lemma 37.3 tells us that there are infinitely many rational
numbers satisfying this inequality. This contradiction shows that $\beta$ cannot be an
algebraic number, which completes the proof that $\beta$ must be a transcendental
number. □


The proof that Liouville’s number is transcendental is not easy, and you are
to be congratulated at having reached the end of our transcendental expedition.
But be aware that we have surveyed only a tiny sliver of the vast continent of
transcendental numbers.

One of the most beautiful theorems in transcendence theory was proved
independently by A.O. Gelfond and T. Schneider in 1934. They showed that if $a$ is any
algebraic number other than 0 or 1 and if $b$ is any irrational algebraic number, then
the number $a^{b}$ is transcendental. For example, the number $2^{\sqrt{2}}$ is transcendental.
Amazingly, the Gelfond–Schneider theorem is true even if $a$ and $b$ are complex
numbers. Thus the number $e^{\pi}$ is transcendental, since $e^{\pi}$ is equal to $(-1)^{-i}$.

Transcendence theory is today an active field of mathematical research with
many innocuous-sounding open problems. For example, it is not known if the
number $\pi + e$ is transcendental; indeed, it is not even known if $\pi + e$ is irrational!


Exercises 


37.1. (a) Suppose that $N$ is a positive integer that is not a perfect square. Prove that $\sqrt{N}$
is irrational. (Be careful not to prove too much. For example, check to make sure that
your proof won’t show that $\sqrt{4}$ is irrational.)

(b) Let $n \ge 2$ be an integer and let $p$ be a prime. Prove that $\sqrt[n]{p}$ is irrational.


(c) Let $n \ge 2$ and $N \ge 2$ be integers. Describe when $\sqrt[n]{N}$ is irrational and prove that
your description is correct.


37.2. Let $A$, $B$, $C$ be integers with $A \ne 0$. Let $r_1$ and $r_2$ be the roots of the
polynomial $Ax^{2} + Bx + C$. Explain under what conditions $r_1$ and $r_2$ are rational. In particular,
explain why they are either both rational or both irrational.


37.3. Give an example of a polynomial of degree 3 with integer coefficients having: 
(a) three distinct rational roots. 
(b) one rational root and two irrational roots. 
(c) no rational roots. 
(d) Can a polynomial of degree 3 have two rational roots and one irrational root? Either 
give an example of such a polynomial or prove that none exists. 


37.4. (a) Find a polynomial with integer coefficients that has the number $\sqrt{2} + \sqrt[3]{3}$ as one
of its roots.
(b) Find a polynomial with integer coefficients that has the number $\sqrt{5} + i$ as one of its
roots, where $i = \sqrt{-1}$.


37.5. Suppose that $f(X) = X^{d} + c_1X^{d-1} + c_2X^{d-2} + \cdots + c_{d-1}X + c_d$ is a polynomial
of degree $d$ whose coefficients $c_1, c_2, \ldots, c_d$ are all integers. Suppose that $r$ is a rational
number that is a root of $f(X)$.

(a) Prove that $r$ must in fact be an integer.

(b) Prove that $r$ must divide $c_d$.


Here $e = 2.7182818\ldots$ is the base of the natural logarithms. Hermite proved that $e$ is
transcendental in 1873. The equality $(-1)^{-i} = e^{\pi}$ follows from Euler’s identity $e^{i\theta} = \cos(\theta) + i\sin(\theta)$.
Putting $\theta = \pi$ gives $e^{i\pi} = -1$, and raising both sides to the $-i$ power gives the desired formula.


[Chap. 37] Irrational Numbers and Transcendental Numbers 310 


37.6. Use the previous exercise to solve the following problems.

(a) Find all the rational roots of $X^{5} - X^{4} - 3X^{3} - 2X^{2} - 19X - 6$.

(b) Find all the rational roots of $X^{5} + 63X^{4} + 135X^{3} + 785X^{2} - 556X - 4148$.
[Hint. You can cut down on the amount of work if, as soon as you find a root $r$, you
divide the polynomial by $X - r$ to get rid of that root.]

(c) For what integer value(s) of $c$ does the following polynomial have a rational root:
$X^{5} + 2X^{4} - cX^{3} + 3cX^{2} + 3$?


37.7. (a) Suppose that $f(X) = c_0X^{d} + c_1X^{d-1} + c_2X^{d-2} + \cdots + c_{d-1}X + c_d$ is a
polynomial of degree $d$ whose coefficients $c_0, c_1, c_2, \ldots, c_d$ are all integers. Suppose
that $r = a/b$ is a rational number that is a root of $f(X)$. Prove that $a$ must divide $c_d$
and that $b$ must divide $c_0$.

(b) Use (a) to find all rational roots of the polynomial
$8x^{7} - 10x^{6} - 3x^{5} + 24x^{4} - 30x^{3} - 33x^{2} + 30x + 9$.
(c) Let $p$ be a prime number. Prove that the polynomial $pX^{5} - X - 1$ has no rational
roots.


37.8. Let $\alpha$ be an algebraic number.

(a) Prove that $\alpha + 2$ and $2\alpha$ are algebraic numbers.

(b) Prove that a + 2 and £q are algebraic numbers. 

(c) More generally, let $r$ be any rational number and prove that $\alpha + r$ and $r\alpha$ are algebraic
numbers.

(d) Prove that $\alpha + \sqrt{2}$ and $\sqrt{2} \cdot \alpha$ are algebraic numbers.

(e) More generally, let $A$ be an integer and prove that $\alpha + \sqrt{A}$ and $\sqrt{A} \cdot \alpha$ are algebraic
numbers.

(f) Try to generalize this exercise as much as you can. 


37.9. The number $\alpha = \sqrt{2} + \sqrt{3}$ is a root of the polynomial $f(X) = X^{4} - 10X^{2} + 1$.
(a) Find a polynomial $g(X)$ such that $f(X)$ factors as $f(X) = (X - \alpha)g(X)$.
(b) Find a number $K$ such that if $a/b$ is any rational number with $|a/b - \alpha| \le 1$ and
$f(a/b) \ne 0$, then $|g(a/b)| \le K$.
(c) Find all rational numbers $a/b$ satisfying the inequality

$$\left|\frac{a}{b} - \alpha\right| \le \frac{1}{b^{5}}.$$

(d) If you know how to program, redo (c) with $1/b^{5}$ replaced by $1/b^{4.5}$.


37.10. Let $\beta_1$ and $\beta_2$ be the numbers


Here $k$ is some fixed integer with $k \ge 2$.
(a) Prove that $\beta_1$ is transcendental. (If you find it confusing to work with a general value
for $k$, first try to do $k = 2$. Note that we already did the case $k = 10$.)
(b) Prove that $\beta_2$ is transcendental.


37.11. Let $\beta_3$ and $\beta_4$ be the numbers

$$\beta_3 = \sum_{n=0}^{\infty} \frac{1}{n!} \qquad\text{and}\qquad \beta_4 = \sum_{n=1}^{\infty} \frac{1}{10^{10^{n}}}.$$

(a) Try to use the methods of this chapter to prove that $\beta_3$ is transcendental. At what
point does the proof break down?

(b) Prove that $\beta_3$ is irrational. [Hint. Assume that $\beta_3$ is rational, say $\beta_3 = a/b$, and look
at the highest power of 2 that must divide $b$.] You may have recognized the famous
number $\beta_3 = e = 2.7182818\ldots$. It turns out that $e$ is indeed transcendental, but it
wasn’t until 33 years after Liouville’s result that Hermite proved the transcendence
of $e$.

(c) Try to use the methods of this chapter to prove that $\beta_4$ is transcendental. At what
point does the proof break down?

(d) Prove that $\beta_4$ is not the root of a polynomial with integer coefficients of degree 9 or
smaller.


37.12. Let $\alpha = r/s$ be a rational number written in lowest terms.
(a) Show that there is exactly one rational number $a/b$ satisfying the inequality

$$|a/b - \alpha| < 1/(sb).$$

(b) Show that the equality $|a/b - \alpha| = 1/(sb)$ is true for infinitely many different rational
numbers $a/b$.


37.13. (a) Prove that $1/(8b^{2}) < |a/b - \sqrt{10}|$ holds for every rational number $a/b$.
(b) Use (a) to find all rational numbers a/b satisfying |a/b — V/10| < 1/6. 


37.14. (a) If $N$ is not a perfect square, find a specific value for $K$ so that the inequality
$K/b^{2} < |a/b - \sqrt{N}|$ holds for every rational number $a/b$. (The value of $K$ will
depend on $N$, but not on $a$ or $b$.)
(b) Use (a) to find all rational numbers $a/b$ satisfying each of the following inequalities:
(i) |a/b — V7| < 1/0° 
(ii) |a/b — V5] < 1/0°/8 
(c) Write a computer program that takes as input three numbers $(N, C, e)$ and prints
as output all rational numbers $a/b$ satisfying $|a/b - \sqrt{N}| < C/b^{e}$. Your program
should check that $N$ is a positive integer and that $C > 0$ and $e > 2$. (If $e \le 2$, your
program should tell the user that she won’t get to see all the solutions, since there are
infinitely many!) Use your program to find all solutions in rational numbers $a/b$ to
the following inequalities:
@) |a/b — /573| < 1/68 
Gi) |a/b — V19| < 1/0? 
ii) |a/b — V6| < 8/0? 
[You’ll need a moderately fast computer for (iii) if you try to do it directly.]


37.15. Determine which of the following numbers are algebraic and which are
transcendental. Be sure to explain your reasoning. You may use the fact that $\pi$ is transcendental,
and you may use the Gelfond–Schneider theorem, which says that if $a$ is any algebraic
number other than 0 or 1 and if $b$ is any irrational algebraic number, then the number $a^{b}$ is
transcendental. [Hint. To keep you on your toes, I’ve thrown one number into the list for
which the answer isn’t known!]


(ay V2" — my V2 = (tan /4)¥?_— ow” 
(e) V7 (f) 7” (g) cos(7/5) (hi); 222) 


37.16. A set $S$ of (real) numbers is said to have the well-ordering property if every subset
of $S$ has a smallest element. (A subset $T$ of $S$ has a smallest element if there is an element
$a \in T$ such that $a < b$ for every other $b \in T$.)
(a) Using the fact that there are no integers lying strictly between 0 and 1, prove that the 
set of nonnegative integers has the well-ordering property. 
(b) Show that the set of nonnegative rational numbers does not have the well-ordering 
property by writing down a specific subset that does not have a smallest element. 


Chapter 38 


Binomial Coefficients 
and Pascal’s Triangle 


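We begin this chapter with a short list of powers of $A + B$.

$$
\begin{aligned}
(A+B)^{0} &= 1 \\
(A+B)^{1} &= A + B \\
(A+B)^{2} &= A^{2} + 2AB + B^{2} \\
(A+B)^{3} &= A^{3} + 3A^{2}B + 3AB^{2} + B^{3} \\
(A+B)^{4} &= A^{4} + 4A^{3}B + 6A^{2}B^{2} + 4AB^{3} + B^{4}
\end{aligned}
$$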


There are many beautiful patterns lurking in this list, some fairly obvious, others 
extremely subtle. Before reading further, you should spend a few minutes looking 
for patterns on your own. 

In this chapter we investigate what happens when the quantity $(A+B)^{n}$ is
multiplied out. It is clear from the above examples that we get an expression that
looks like

$$(A+B)^{n} = \square A^{n} + \square A^{n-1}B + \square A^{n-2}B^{2} + \square A^{n-3}B^{3} + \cdots + \square A^{2}B^{n-2} + \square AB^{n-1} + \square B^{n},$$

where the empty boxes need to be filled in with some integers.

Clearly, the first and last boxes are filled in with the number 1, and from the
examples it appears that the second and next-to-last boxes should contain the
number $n$. Unfortunately, it’s not at all clear what should go into the other boxes, but
our lack of knowledge doesn’t prevent us from giving these numbers a name. The
integers that appear in the expansion of $(A+B)^{n}$ are called binomial coefficients,
because $A + B$ is a binomial (i.e., a quantity consisting of two terms), and the
numbers we’re studying appear as coefficients when the binomial $A + B$ is raised
to a power. There are a variety of different symbols commonly used for binomial
coefficients.¹ We use the symbol

$$\binom{n}{k} = \text{Coefficient of } A^{n-k}B^{k} \text{ in } (A+B)^{n}.$$

So using binomial coefficient symbols, the expansion of $(A+B)^{n}$ looks like

$$\binom{n}{0}A^{n} + \binom{n}{1}A^{n-1}B + \binom{n}{2}A^{n-2}B^{2} + \cdots + \binom{n}{k}A^{n-k}B^{k} + \cdots + \binom{n}{n}B^{n}.$$


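To study binomial coefficients, it is convenient to arrange them in the form of a
triangle, where the $n$th row of the triangle contains the binomial coefficients
appearing in the expansion of $(A+B)^{n}$. This arrangement is called Pascal’s Triangle
after the seventeenth-century French mathematician and natural philosopher Blaise
Pascal.

$$
\begin{array}{ll}
(A+B)^{0}: & \binom{0}{0} \\
(A+B)^{1}: & \binom{1}{0} \quad \binom{1}{1} \\
(A+B)^{2}: & \binom{2}{0} \quad \binom{2}{1} \quad \binom{2}{2} \\
(A+B)^{3}: & \binom{3}{0} \quad \binom{3}{1} \quad \binom{3}{2} \quad \binom{3}{3} \\
(A+B)^{4}: & \binom{4}{0} \quad \binom{4}{1} \quad \binom{4}{2} \quad \binom{4}{3} \quad \binom{4}{4}
\end{array}
$$

The First Five Rows of Pascal’s Triangle


We can use the list appearing at the beginning of this chapter to fill in the values.

$$
\begin{array}{ll}
(A+B)^{0}: & 1 \\
(A+B)^{1}: & 1 \quad 1 \\
(A+B)^{2}: & 1 \quad 2 \quad 1 \\
(A+B)^{3}: & 1 \quad 3 \quad 3 \quad 1 \\
(A+B)^{4}: & 1 \quad 4 \quad 6 \quad 4 \quad 1
\end{array}
$$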


¹ The binomial coefficient $\binom{n}{k}$ is also called a combinatorial number and assigned the symbol ${}_{n}C_{k}$.

How might we form the next row of Pascal’s Triangle? One method is simply
to multiply out the quantity $(A+B)^{5}$ and record the coefficients. A simpler method
is to take the already known expansion of $(A+B)^{4}$ and multiply it by $A + B$ to
get $(A+B)^{5}$. Thus

$$
\begin{array}{rl}
(A+B)^{4} = & A^{4} + 4A^{3}B + 6A^{2}B^{2} + 4AB^{3} + B^{4} \\
\times\ (A+B) & \\
\hline
B \cdot (A+B)^{4} = & A^{4}B + 4A^{3}B^{2} + 6A^{2}B^{3} + 4AB^{4} + B^{5} \\
A \cdot (A+B)^{4} = & A^{5} + 4A^{4}B + 6A^{3}B^{2} + 4A^{2}B^{3} + AB^{4} \\
\hline
(A+B)^{5} = & A^{5} + 5A^{4}B + 10A^{3}B^{2} + 10A^{2}B^{3} + 5AB^{4} + B^{5}
\end{array}
$$

So the next row of Pascal’s Triangle is

    1   5   10   10   5   1

We can use this simple idea,

$$(A+B)^{n+1} = (A+B) \cdot (A+B)^{n},$$


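to derive a fundamental relationship for the binomial coefficients. If we multiply
$(A+B)^{n}$ by $A + B$ as before and equate the result with $(A+B)^{n+1}$, we find that

$$
\begin{array}{rl}
(A+B)^{n} = & \binom{n}{0}A^{n} + \binom{n}{1}A^{n-1}B + \binom{n}{2}A^{n-2}B^{2} + \cdots + \binom{n}{n}B^{n} \\
\times\ (A+B) & \\
\hline
B \cdot (A+B)^{n} = & \binom{n}{0}A^{n}B + \binom{n}{1}A^{n-1}B^{2} + \cdots + \binom{n}{n-1}AB^{n} + \binom{n}{n}B^{n+1} \\
A \cdot (A+B)^{n} = & \binom{n}{0}A^{n+1} + \binom{n}{1}A^{n}B + \binom{n}{2}A^{n-1}B^{2} + \cdots + \binom{n}{n}AB^{n} \\
\hline
(A+B)^{n+1} = & \binom{n+1}{0}A^{n+1} + \binom{n+1}{1}A^{n}B + \binom{n+1}{2}A^{n-1}B^{2} + \cdots + \binom{n+1}{n}AB^{n} + \binom{n+1}{n+1}B^{n+1}
\end{array}
$$

Thus, for example,

$$\binom{n}{0} + \binom{n}{1} = \binom{n+1}{1}, \qquad \binom{n}{1} + \binom{n}{2} = \binom{n+1}{2}, \qquad \binom{n}{2} + \binom{n}{3} = \binom{n+1}{3}, \qquad \text{and so on.}$$

In general, we get the following fundamental formula:


Theorem 38.1 (Addition Formula for Binomial Coefficients). Let $n \ge k \ge 0$ be
integers. Then

$$\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}.$$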

The addition formula describes a wonderful property of Pascal’s Triangle: Each
entry in the triangle is equal to the sum of the two entries above it. For example,
we found earlier that the $n = 5$ row of Pascal’s triangle is

    1   5   10   10   5   1

so the $n = 6$ row can be computed by adding adjacent pairs in the $n = 5$ row, as
illustrated here:

    [n = 5 Row]     1   5   10   10   5   1
    [n = 6 Row]   1   6   15   20   15   6   1

This shows that

$$(A+B)^{6} = A^{6} + 6A^{5}B + 15A^{4}B^{2} + 20A^{3}B^{3} + 15A^{2}B^{4} + 6AB^{5} + B^{6}$$


without doing any algebra at all! Here is a picture of Pascal’s Triangle illustrating 
the binomial coefficient addition formula. 


    [n = 0]                    1
    [n = 1]                  1   1
    [n = 2]                1   2   1
    [n = 3]              1   3   3   1
    [n = 4]            1   4   6   4   1
    [n = 5]          1   5  10  10   5   1
    [n = 6]        1   6  15  20  15   6   1

Pascal’s Triangle Illustrating the Rule $\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}$
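
The addition formula translates directly into a procedure for building the triangle one row at a time. The following is a minimal Python sketch; the function names `next_row` and `pascal_rows` are illustrative choices, not notation from the text.

```python
def next_row(row):
    """Build row n+1 from row n: each interior entry is the sum of the two entries above it."""
    interior = [row[k - 1] + row[k] for k in range(1, len(row))]
    return [1] + interior + [1]


def pascal_rows(count):
    """Return rows n = 0, 1, ..., count - 1 of Pascal's Triangle."""
    rows = [[1]]
    while len(rows) < count:
        rows.append(next_row(rows[-1]))
    return rows


for n, row in enumerate(pascal_rows(7)):
    print(f"[n = {n}]", *row)
```

No multiplication of polynomials is needed: each new row costs only as many additions as it has interior entries.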


Our next task is to derive a completely different sort of formula for the bino- 
mial coefficients. This illustrates a simple but amazingly powerful method that 
appears time and again in the development of modern mathematics. Anytime you 
can compute a quantity in two different ways, comparison of the resulting formulas 
yields interesting and useful information. 

To find a new formula for the binomial coefficient $\binom{n}{k}$, we consider what
happens when we multiply out the quantity

$$(A+B)^{n} = (A+B)(A+B)(A+B) \cdots (A+B)(A+B).$$

Let’s start with the particular case of

$$(A+B)^{3} = (A+B)(A+B)(A+B).$$

The product consists of a bunch of terms formed by choosing either $A$ or $B$ from
the first factor, then choosing either $A$ or $B$ from the second factor, and finally
choosing either $A$ or $B$ from the third factor. This gives a total of eight terms. How
many of those terms are equal to $A^{3}$? The only way to get $A^{3}$ is to choose $A$ from
every factor, so there is only one way to get $A^{3}$.

Next consider the number of ways to get $A^{2}B$. We can get $A^{2}B$ in the
following ways:

• $A$ from the first and second factors and $B$ from the third factor,
  $(\underline{A}+B)(\underline{A}+B)(A+\underline{B})$;

• $A$ from the first and third factors and $B$ from the second factor,
  $(\underline{A}+B)(A+\underline{B})(\underline{A}+B)$;

• $A$ from the second and third factors and $B$ from the first factor,
  $(A+\underline{B})(\underline{A}+B)(\underline{A}+B)$.

We have illustrated the three possibilities by highlighting the $A$’s
and $B$’s being used in each case. This shows that the coefficient of $A^{2}B$ in $(A+B)^{3}$
is 3, so the binomial coefficient $\binom{3}{1}$ is equal to 3.

Now we generalize this argument to count the number of different ways to get
$A^{k}B^{n-k}$ in the product

$$\underbrace{(A+B)(A+B)(A+B) \cdots (A+B)(A+B)}_{n \text{ factors}}.$$

We can get $A^{k}B^{n-k}$ by choosing $A$ from any $k$ of the factors and then choosing $B$
from the remaining $n - k$ factors. So we need to count the number of ways to
select $k$ of the factors. Let’s make one selection at a time.

We have $n$ choices for the first factor. Once we’ve made that choice, then there
are $n - 1$ factors left from which to make our second choice. After we’ve made
these two choices, there are $n - 2$ factors left from which to make our third choice,
and so on. There are thus

$$n(n-1)(n-2) \cdots (n-k+1)$$

ways to choose $k$ of the factors from among the collection of $n$ factors.

Unfortunately, we’ve overcounted the number of ways to get $A^{k}B^{n-k}$, because
we’ve made our choices in a particular order. To illustrate the problem, we return
to our $n = 3$ example. In this case one way to get $A^{2}B$ is to take $A$ from the first
and the third factors, but we’ve counted this choice twice, because we counted it
once as

    “first choose the first factor, next choose the third factor”

and we counted it a second time as

    “first choose the third factor, next choose the first factor.”

Thus the actual number of $A^{k}B^{n-k}$ terms in $(A+B)^{n}$ is

$$n(n-1)(n-2) \cdots (n-k+1)$$

divided by the number of different orders in which we can make our choices.
Remember we’re making $k$ choices, so the number of different orders for these
choices is $k!$, since we can put any of the $k$ choices first, then any of the
remaining $k - 1$ choices second, and so on. Therefore, the number of ways to get the
term $A^{k}B^{n-k}$ in the product $(A+B)^{n}$ is equal to

$$\frac{n(n-1)(n-2) \cdots (n-k+1)}{k!}.$$

Of course, this is precisely the binomial coefficient $\binom{n}{k}$, so we have proved the
celebrated Binomial Theorem.
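
The counting argument can be tested by brute force for small $n$. The sketch below, in Python with illustrative function names, lists all $2^{n}$ ways of picking $A$ or $B$ from each factor, counts those with exactly $k$ copies of $A$, and compares the count with the number of ordered selections divided by $k!$.

```python
from itertools import product
from math import factorial


def count_terms(n, k):
    """Count the terms with exactly k A's among the 2^n products in (A+B)^n."""
    return sum(1 for choice in product("AB", repeat=n) if choice.count("A") == k)


def ordered_selections_over_orderings(n, k):
    """n(n-1)...(n-k+1) ordered selections, divided by the k! possible orders."""
    ordered = 1
    for i in range(k):
        ordered *= n - i
    return ordered // factorial(k)


print(count_terms(3, 2), ordered_selections_over_orderings(3, 2))
```

For $n = 3$ and $k = 2$ both functions report the three ways of forming $A^{2}B$ found above.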


Theorem 38.2 (Binomial Theorem). The binomial coefficients in the expansion

$$(A+B)^{n} = \binom{n}{0}A^{n} + \binom{n}{1}A^{n-1}B + \binom{n}{2}A^{n-2}B^{2} + \cdots + \binom{n}{n}B^{n}$$

are given by the formula

$$\binom{n}{k} = \frac{n(n-1)(n-2) \cdots (n-k+1)}{k!} = \frac{n!}{k!\,(n-k)!}.$$

Proof. We have already done the hard work of proving the first equality. To get
the second formula, we simply multiply the numerator and denominator of the first
fraction by $(n-k)!$ to get

$$\frac{n(n-1)(n-2) \cdots (n-k+1)}{k!} \cdot \frac{(n-k)!}{(n-k)!} = \frac{n!}{k!\,(n-k)!}. \qquad \square$$

For example, the coefficient of $A^{4}B^{3}$ in $(A+B)^{7}$ is equal to

$$\binom{7}{3} = \frac{7 \cdot 6 \cdot 5}{3!} = \frac{210}{6} = 35.$$

As another example, the coefficient of $A^{8}B^{11}$ in $(A+B)^{19}$ is

$$\binom{19}{11} = \frac{19 \cdot 18 \cdot 17 \cdot 16 \cdot 15 \cdot 14 \cdot 13 \cdot 12 \cdot 11 \cdot 10 \cdot 9}{11!} = \frac{3016991577600}{39916800} = 75582.$$

Of course, if $k$ is larger than $n/2$, as in this last example, then it is easier to first
use the following Binomial Coefficient Symmetry Formula:

$$\binom{n}{k} = \binom{n}{n-k}.$$

The symmetry formula simply says that when $(A+B)^{n}$ is multiplied out, the two
terms $A^{k}B^{n-k}$ and $A^{n-k}B^{k}$ have the same coefficient. This is clearly true since
there is nothing to distinguish $A$ and $B$ from one another. Using the symmetry
formula, we can compute

$$\binom{19}{11} = \binom{19}{8} = \frac{19 \cdot 18 \cdot 17 \cdot 16 \cdot 15 \cdot 14 \cdot 13 \cdot 12}{8!} = \frac{3047466240}{40320} = 75582.$$
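
The formula and the symmetry trick combine into a short routine. This is a sketch in Python, with the function name `binomial` chosen for illustration. As a small departure from the hand computation, it interleaves the multiplications and divisions instead of forming the whole numerator first, which keeps the intermediate numbers small; every intermediate value is itself a binomial coefficient, so the integer divisions are exact. The standard library function `math.comb` is used only as a cross-check.

```python
from math import comb


def binomial(n, k):
    """Compute n(n-1)...(n-k+1)/k!, using the symmetry formula when k > n/2."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # symmetry: binomial(n, k) == binomial(n, n - k)
    result = 1
    for i in range(1, k + 1):
        # After this step, result equals binomial(n - k + i, i), an integer.
        result = result * (n - k + i) // i
    return result


print(binomial(7, 3), binomial(19, 11), comb(19, 11))
```

With the symmetry step, computing $\binom{19}{11}$ takes 8 multiplications rather than 11.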


Binomial Coefficients Modulo p 


What happens if we reduce a binomial coefficient $\binom{n}{k}$ modulo $p$, where $p$ is a prime
number? Here is what the first few lines of Pascal’s triangle look like modulo 5 and
modulo 7.

                  1
                 1 1
                1 2 1
               1 3 3 1
              1 4 1 4 1
             1 0 0 0 0 1
            1 1 0 0 0 1 1
           1 2 1 0 0 1 2 1

Pascal’s Triangle Modulo 5

                  1
                 1 1
                1 2 1
               1 3 3 1
              1 4 6 4 1
             1 5 3 3 5 1
            1 6 1 6 1 6 1
           1 0 0 0 0 0 0 1

Pascal’s Triangle Modulo 7

Notice that the $n = 5$ line of the modulo 5 Pascal triangle is 1 0 0 0 0 1, and
similarly the $n = 7$ line of the modulo 7 Pascal triangle is 1 0 0 0 0 0 0 1. This
suggests that $\binom{p}{k}$ should equal 0 modulo $p$ if $1 \le k \le p - 1$. Having made this
observation, it is easy to prove, and it gives a wonderfully simple version of the
binomial theorem modulo $p$.


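Theorem 38.3 (Binomial Theorem Modulo $p$). Let $p$ be a prime number.
(a) The binomial coefficient $\binom{p}{k}$ is congruent to

$$\binom{p}{k} \equiv \begin{cases} 0 \pmod{p} & \text{if } 1 \le k \le p - 1, \\ 1 \pmod{p} & \text{if } k = 0 \text{ or } k = p. \end{cases}$$

(b) For any numbers $A$ and $B$, we have

$$(A+B)^{p} \equiv A^{p} + B^{p} \pmod{p}.$$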


Proof. (a) If $k = 0$ or $k = p$, then we know that $\binom{p}{k} = 1$. So the interesting
problem is to find out what happens when $k$ is between 1 and $p - 1$. Let’s take a
particular example, say $\binom{7}{5}$, and try to understand what’s going on. Our formula
for this binomial coefficient is

$$\binom{7}{5} = \frac{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}.$$

Notice that the number 7 appears in the numerator and that there are no 7’s in the
denominator to cancel the 7 in the numerator. Thus $\binom{7}{5}$ is divisible by 7, which is
the same as saying that it is congruent to 0 modulo 7.

This idea works in complete generality. The binomial coefficient $\binom{p}{k}$ is equal
to

$$\binom{p}{k} = \frac{p \cdot (p-1) \cdot (p-2) \cdots (p-k+1)}{k \cdot (k-1) \cdot (k-2) \cdots 2 \cdot 1}.$$

Thus $\binom{p}{k}$ has a $p$ in the numerator (provided $k \ge 1$), and there are no $p$’s in the
denominator to cancel it (provided $k \le p - 1$). Hence $\binom{p}{k}$ is divisible by $p$, so it is
congruent to 0 modulo $p$.

Do you see where we are using the fact that $p$ is a prime? If it weren’t a prime,
then it might happen that some of the smaller numbers in the denominator could
combine to cancel part or all of $p$. Thus our proof does not work for composite
numbers. In other words, we have not proved that $\binom{n}{k} \equiv 0 \pmod{n}$ for composite
numbers $n$. Do you think that this more general statement is true?

(b) Using the Binomial Theorem and part (a), it is easy to compute

$$
\begin{aligned}
(A+B)^{p} &= \binom{p}{0}A^{p} + \binom{p}{1}A^{p-1}B + \binom{p}{2}A^{p-2}B^{2} + \cdots + \binom{p}{p-2}A^{2}B^{p-2} + \binom{p}{p-1}AB^{p-1} + \binom{p}{p}B^{p} \\
&\equiv 1 \cdot A^{p} + 0 \cdot A^{p-1}B + 0 \cdot A^{p-2}B^{2} + \cdots + 0 \cdot A^{2}B^{p-2} + 0 \cdot AB^{p-1} + 1 \cdot B^{p} \pmod{p} \\
&\equiv A^{p} + B^{p} \pmod{p}. \qquad \square
\end{aligned}
$$
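
Both parts of the theorem are easy to test numerically. The following Python sketch uses illustrative function names; `math.comb` and the three-argument form of `pow` (modular exponentiation) are standard library features. A finite check like this is evidence, not a proof.

```python
from math import comb


def interior_all_divisible(n):
    """True if binomial(n, k) is divisible by n for every 1 <= k <= n - 1."""
    return all(comb(n, k) % n == 0 for k in range(1, n))


def power_of_sum_rule_holds(n, limit=10):
    """True if (A+B)^n is congruent to A^n + B^n modulo n for all 0 <= A, B < limit."""
    return all(
        pow(A + B, n, n) == (pow(A, n, n) + pow(B, n, n)) % n
        for A in range(limit)
        for B in range(limit)
    )


for p in (2, 3, 5, 7, 11, 13):
    print(p, interior_all_divisible(p), power_of_sum_rule_holds(p))
```

Calling either function with a composite value of `n` is a quick way to explore the question raised at the end of part (a).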


The formula

$$(A+B)^{p} \equiv A^{p} + B^{p} \pmod{p}$$

is one of the most important formulas in all of number theory. It says that the
$p$th power of a sum is congruent to the sum of the $p$th powers. We conclude by
using this formula to give a new proof of Fermat’s Little Theorem. You should
compare this proof with the one that we gave in Chapter 9. Each proof reveals
different aspects of the underlying formula. Which proof do you like best?


Theorem 38.4 (Fermat’s Little Theorem). Let $p$ be a prime number, and let $a$ be
any number with $a \not\equiv 0 \pmod{p}$. Then

$$a^{p-1} \equiv 1 \pmod{p}.$$

Proof by Induction. We start by using induction to prove that the formula

$$a^{p} \equiv a \pmod{p}$$

is true for all numbers $a$. This formula is clearly true for $a = 0$, which gets our
induction started. Next suppose that we know it is true for some particular value
of $a$. Then

$$
\begin{aligned}
(a+1)^{p} &\equiv a^{p} + 1^{p} \pmod{p} && \text{using the Binomial Theorem Modulo } p \text{ with } A = a \text{ and } B = 1, \\
&\equiv a + 1 \pmod{p} && \text{since } a^{p} \equiv a \pmod{p} \text{ by the induction hypothesis.}
\end{aligned}
$$

This completes the proof by induction of the formula

$$a^{p} \equiv a \pmod{p}.$$

This means that $p$ divides $a^{p} - a$, so

$$p \text{ divides } a(a^{p-1} - 1).$$

Since $p$ does not divide $a$ by assumption, we conclude that

$$a^{p-1} \equiv 1 \pmod{p},$$

which completes the proof of Fermat’s Little Theorem. □
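
The induction can be followed step by step on a computer. The sketch below is Python with illustrative function names. For a given prime $p$ it walks $a = 0, 1, \ldots, p - 1$, checks the two congruences used in the induction step, and then checks the conclusion of the theorem.

```python
def induction_steps_hold(p):
    """Check (a+1)^p = a^p + 1 = a + 1 (mod p) for a = 0, 1, ..., p - 1."""
    for a in range(p):
        binomial_step = pow(a + 1, p, p) == (pow(a, p, p) + 1) % p
        conclusion = pow(a + 1, p, p) == (a + 1) % p
        if not (binomial_step and conclusion):
            return False
    return True


def little_theorem_holds(p):
    """Check a^(p-1) = 1 (mod p) for every a with 1 <= a <= p - 1."""
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))


for p in (2, 3, 5, 7, 11, 101):
    print(p, induction_steps_hold(p), little_theorem_holds(p))
```

It is enough to test $a$ between 0 and $p - 1$, since both sides of each congruence depend only on $a$ modulo $p$.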


Exercises 


38.1. Compute each of the following binomial coefficients. 


10 20 15 300 
@(S) »() © (3) @ Gy) 


38.2. Use the formula $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ to prove the addition formula

$$\binom{n}{k-1} + \binom{n}{k} = \binom{n+1}{k}.$$


38.3. What is the value obtained if we sum a row

$$\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n-1} + \binom{n}{n}$$

of Pascal’s Triangle? Compute some values, formulate a conjecture, and prove that your
conjecture is correct.


38.4. If we use the formula

$$\binom{n}{k} = \frac{n(n-1)(n-2) \cdots (n-k+1)}{k!}$$

to define the binomial coefficient $\binom{n}{k}$, then the binomial coefficient makes sense for any
value of $n$ as long as $k$ is a nonnegative integer.
(a) Find a simple formula for $\binom{-1}{k}$ and prove that your formula is correct.

(b) Find a formula for $\binom{-1/2}{k}$ and prove that your formula is correct.


38.5. This exercise presupposes some knowledge of calculus. If $n$ is a positive integer,
then putting $A = 1$ and $B = x$ in the formula for $(A+B)^{n}$ gives

$$(1+x)^{n} = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^{2} + \binom{n}{3}x^{3} + \cdots + \binom{n}{n}x^{n}.$$

In the previous exercise we noted that the binomial coefficient $\binom{n}{k}$ makes sense even if $n$
is not a positive integer. Assuming that $n$ is not a positive integer, prove that the infinite
series

$$\binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^{2} + \binom{n}{3}x^{3} + \cdots$$

converges to the value $(1+x)^{n}$ provided that $x$ satisfies $|x| < 1$.


38.6. We proved that if $p$ is a prime number and if $1 \le k \le p - 1$, then the binomial
coefficient $\binom{p}{k}$ is divisible by $p$.
(a) Find an example of integers $n$ and $k$ with $1 \le k \le n - 1$ and $\binom{n}{k}$ not divisible by $n$.


(b) For each composite number $n = 4, 6, 8, 10, 12,$ and $14$, compute $\binom{n}{k}$ modulo $n$ for
each $1 \le k \le n - 1$ and pick out the ones that are 0 modulo $n$.

(c) Use your data from (b) to make a conjecture as to when the binomial coefficient $\binom{n}{k}$
is divisible by $n$.

(d) Prove that your conjecture in (c) is correct. 


38.7. (a) Compute the value of the quantity

$$\binom{p-1}{k} \pmod{p}$$

for a selection of prime numbers $p$ and integers $0 \le k \le p - 1$, and make a conjecture
as to its value. Prove that your conjecture is correct.
(b) Find a similar formula for the value of

$$\binom{p-2}{k} \pmod{p}.$$


38.8. We proved that $(A+B)^{p} \equiv A^{p} + B^{p} \pmod{p}$.
(a) Generalize this result to a sum of $n$ numbers. That is, prove that

$$(A_1 + A_2 + A_3 + \cdots + A_n)^{p} \equiv A_1^{p} + A_2^{p} + A_3^{p} + \cdots + A_n^{p} \pmod{p}.$$

(b) Is the corresponding multiplication formula true,

$$(A_1 \cdot A_2 \cdot A_3 \cdots A_n)^{p} \equiv A_1^{p} \cdot A_2^{p} \cdot A_3^{p} \cdots A_n^{p} \pmod{p}?$$

Either prove that it is true or give a counterexample.


Chapter 39 


Fibonacci’s Rabbits and 
Linear Recurrence Sequences 


In 1202 Leonardo of Pisa (also known as Leonardo Fibonacci) published his Liber 
Abbaci, a highly influential book of practical mathematics. In this book Leonardo 
introduced the elegant Hindu/Arabic numerical system (the digits 1, 2, ..., 9 and a
symbol/placeholder for 0) to Europeans who were still laboring under the handicap
of Roman numerals. Leonardo’s book also contains the following curious Rabbit 
Problem. 


In the first month, start with a pair of baby rabbits. One month later 
they have grown up. The following month the pair of grown rabbits 
produce a pair of babies, so now we have one pair of grown rabbits and 
one pair of baby rabbits. Each month thereafter, each pair of grown 
rabbits produces a new pair of babies, and every pair of baby rabbits 
grows up. How many pairs of rabbits will there be at the end of one 
year? 


The first few months of rabbit procreation are illustrated in Figure 39.1, where
each bunny image in Figure 39.1 represents a pair of rabbits.

[Figure 39.1: Fibonacci’s Rabbits (each rabbit image represents a pair)]

If we let

$$F_n = \text{Number of pairs of rabbits after } n \text{ months},$$

and if we remember that each month the baby pairs grow up and that each month
the grown pairs produce new baby pairs, we can compute the number of pairs of
rabbits (baby and grown) in each subsequent month. Thus $F_1 = 1$ (one baby pair)
and $F_2 = 1$ (one grown pair) and $F_3 = 2$ (one grown pair plus a new baby pair) and
$F_4 = 3$ (two grown pairs plus a new baby pair). Continuing with this computation,


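we find that

$$
\begin{array}{lrlcrlcrl}
F_{1} = & 0 & \text{Grown Pairs} & + & 1 & \text{Baby Pair} & = & 1 & \text{Pair} \\
F_{2} = & 1 & \text{Grown Pair} & + & 0 & \text{Baby Pairs} & = & 1 & \text{Pair} \\
F_{3} = & 1 & \text{Grown Pair} & + & 1 & \text{Baby Pair} & = & 2 & \text{Pairs} \\
F_{4} = & 2 & \text{Grown Pairs} & + & 1 & \text{Baby Pair} & = & 3 & \text{Pairs} \\
F_{5} = & 3 & \text{Grown Pairs} & + & 2 & \text{Baby Pairs} & = & 5 & \text{Pairs} \\
F_{6} = & 5 & \text{Grown Pairs} & + & 3 & \text{Baby Pairs} & = & 8 & \text{Pairs} \\
F_{7} = & 8 & \text{Grown Pairs} & + & 5 & \text{Baby Pairs} & = & 13 & \text{Pairs} \\
F_{8} = & 13 & \text{Grown Pairs} & + & 8 & \text{Baby Pairs} & = & 21 & \text{Pairs} \\
F_{9} = & 21 & \text{Grown Pairs} & + & 13 & \text{Baby Pairs} & = & 34 & \text{Pairs} \\
F_{10} = & 34 & \text{Grown Pairs} & + & 21 & \text{Baby Pairs} & = & 55 & \text{Pairs} \\
F_{11} = & 55 & \text{Grown Pairs} & + & 34 & \text{Baby Pairs} & = & 89 & \text{Pairs} \\
F_{12} = & 89 & \text{Grown Pairs} & + & 55 & \text{Baby Pairs} & = & 144 & \text{Pairs} \\
F_{13} = & 144 & \text{Grown Pairs} & + & 89 & \text{Baby Pairs} & = & 233 & \text{Pairs.}
\end{array}
$$

This answers Fibonacci’s question. At the end of the year (after the 12th month is
completed) there are 233 pairs of rabbits. The Fibonacci sequence of numbers

$$1,\ 1,\ 2,\ 3,\ 5,\ 8,\ 13,\ 21,\ 34,\ 55,\ 89,\ 144,\ 233,\ \ldots$$

arising from Fibonacci’s Rabbit Problem has intrigued people from the thirteenth
century up to the present day.¹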
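
The month-by-month bookkeeping of the Rabbit Problem can be simulated directly. The sketch below is Python with an illustrative function name. It tracks grown pairs and baby pairs separately: each month every baby pair grows up, and every pair that was already grown produces one new baby pair.

```python
def rabbit_pairs(months):
    """Return a list of (grown, babies, total) pairs for months 1, 2, ..., months."""
    grown, babies = 0, 1  # month 1: a single pair of baby rabbits
    history = []
    for _ in range(months):
        history.append((grown, babies, grown + babies))
        # Babies grow up; each previously grown pair produces a new baby pair.
        grown, babies = grown + babies, grown
    return history


for month, (grown, babies, total) in enumerate(rabbit_pairs(13), start=1):
    print(f"F_{month} = {grown} grown + {babies} baby = {total}")
```

The simulation reproduces the hand computation without ever using a formula for $F_n$; it only follows the two rules of the problem.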

Suppose that we want to extend our list of Fibonacci numbers $F_n$ beyond the
12th month. Looking at our list, we see that each Fibonacci number is simply the
sum of the previous two Fibonacci numbers. In symbols, this becomes the formula

$$F_n = F_{n-1} + F_{n-2}.$$

Notice that this isn’t really a formula for $F_n$, because it doesn’t directly give the
value of $F_n$. Instead it gives a rule telling us how to compute the $n$th Fibonacci
number from the previous numbers. The fancy mathematical word for this sort of
rule is a recursion or a recursive formula.

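We can use the recursive formula for $F_n$ to create a table of values.

     n    F_n    |    n    F_n     |    n    F_n
     1      1    |   11      89    |   21    10,946
     2      1    |   12     144    |   22    17,711
     3      2    |   13     233    |   23    28,657
     4      3    |   14     377    |   24    46,368
     5      5    |   15     610    |   25    75,025
     6      8    |   16     987    |   26   121,393
     7     13    |   17   1,597    |   27   196,418
     8     21    |   18   2,584    |   28   317,811
     9     34    |   19   4,181    |   29   514,229
    10     55    |   20   6,765    |   30   832,040

The Fibonacci Numbers $F_n$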


¹ There is even a journal called the Fibonacci Quarterly that was started in 1962 and is devoted to
Fibonacci’s sequence and its generalizations.

The Fibonacci numbers appear to grow very rapidly. Indeed, the 31st Fibonacci
number is already larger than 1 million,

    $F_{31}$ = 1,346,269;

and in 45 months (less than 4 years),

    $F_{45}$ = 1,134,903,170,

and we have more than 1 billion pairs of rabbits! Now look how large the numbers
become before we reach even the 200th Fibonacci number:

    $F_{60}$  = 1,548,008,755,920
    $F_{74}$  = 1,304,969,544,928,657
    $F_{88}$  = 1,100,087,778,366,101,931
    $F_{103}$ = 1,500,520,536,206,896,083,277
    $F_{117}$ = 1,264,937,032,042,997,393,488,322
    $F_{131}$ = 1,066,340,417,491,710,595,814,572,169
    $F_{146}$ = 1,454,489,111,232,772,683,678,306,641,953
    $F_{160}$ = 1,226,132,595,394,188,293,000,174,702,095,995
    $F_{174}$ = 1,033,628,323,428,189,498,226,463,595,560,281,832
    $F_{189}$ = 1,409,869,790,947,669,143,312,035,591,975,596,518,914.
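
Because the recursion needs only the two previous values, numbers of this size are easy to generate with exact integer arithmetic. The sketch below is Python with illustrative function names; it finds the first Fibonacci number exceeding a given threshold, which is one way to locate milestones such as the first value above 1 million.

```python
def fibonacci():
    """Yield (n, F_n) for n = 1, 2, 3, ..., where F_1 = F_2 = 1."""
    previous, current = 0, 1
    n = 1
    while True:
        yield n, current
        previous, current = current, previous + current
        n += 1


def first_exceeding(threshold):
    """Return (n, F_n) for the smallest n with F_n > threshold."""
    for n, value in fibonacci():
        if value > threshold:
            return n, value


for exponent in (6, 9, 12, 15):
    n, value = first_exceeding(10 ** exponent)
    print(f"F_{n} = {value:,}")
```

Python integers have arbitrary precision, so the same loop continues to work well past the 200th Fibonacci number.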

Number theory is all about patterns, but how can we possibly find a pattern in
numbers that grow so rapidly? One thing we can do is try to discover just how
fast the Fibonacci numbers are growing. For example, how much larger than its
predecessor is each successive Fibonacci number? This is measured by the ratio
$F_n/F_{n-1}$, so we compute the first few values.

    $F_3/F_2$  = 2.00000        $F_{11}/F_{10}$ = 1.61818
    $F_4/F_3$  = 1.50000        $F_{12}/F_{11}$ = 1.61797
    $F_5/F_4$  = 1.66666        $F_{13}/F_{12}$ = 1.61805
    $F_6/F_5$  = 1.60000        $F_{14}/F_{13}$ = 1.61802
    $F_7/F_6$  = 1.62500        $F_{15}/F_{14}$ = 1.61803
    $F_8/F_7$  = 1.61538        $F_{16}/F_{15}$ = 1.61803
    $F_9/F_8$  = 1.61904        $F_{17}/F_{16}$ = 1.61803
    $F_{10}/F_9$ = 1.61764        $F_{18}/F_{17}$ = 1.61803

It looks like the ratio $F_n/F_{n-1}$ is getting closer and closer to some number around
1.61803. It’s hard to guess exactly what number this is, so let’s see how we might
figure it out.
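
The table of ratios is simple to regenerate and extend. This is a minimal Python sketch with illustrative function names. It uses floating-point division, so the printed values are approximations and may differ from the table in the last digit.

```python
def fibonacci_list(count):
    """Return [F_1, F_2, ..., F_count] with F_1 = F_2 = 1."""
    values = [1, 1]
    while len(values) < count:
        values.append(values[-1] + values[-2])
    return values[:count]


def successive_ratios(count):
    """Return a dict mapping n to F_n / F_(n-1) for n = 3, ..., count."""
    F = fibonacci_list(count)
    # F[n - 1] holds F_n because Python lists are indexed from 0.
    return {n: F[n - 1] / F[n - 2] for n in range(3, count + 1)}


for n, ratio in successive_ratios(18).items():
    print(f"F_{n}/F_{n - 1} = {ratio:.5f}")
```

Raising `count` shows the ratios settling down to more and more digits.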

The last table suggests that $F_n$ is approximately equal to $\alpha F_{n-1}$ for some fixed
number $\alpha$ whose value we don’t know. So we write

$$F_n \approx \alpha F_{n-1},$$

where the squiggly equals sign means “approximately equal to.” The same
reasoning tells us that

$$F_{n-1} \approx \alpha F_{n-2},$$

and if we substitute this into $F_n \approx \alpha F_{n-1}$, we get

$$F_n \approx \alpha F_{n-1} \approx \alpha^{2} F_{n-2}.$$

So we suspect that $F_n \approx \alpha^{2} F_{n-2}$ and $F_{n-1} \approx \alpha F_{n-2}$. We also know the
Fibonacci recursive equation $F_n = F_{n-1} + F_{n-2}$, so we find that

$$\alpha^{2} F_{n-2} \approx \alpha F_{n-2} + F_{n-2}.$$

Dividing by $F_{n-2}$ and moving everything to one side yields the equation

$$\alpha^{2} - \alpha - 1 = 0.$$

We know how to solve an equation like this: use the quadratic formula.

$$\alpha = \frac{1 + \sqrt{5}}{2} \qquad\text{or}\qquad \frac{1 - \sqrt{5}}{2}.$$

We were looking for the value of $\alpha$, but we seem to have hit the jackpot and found
two values! Both of these values satisfy the equation $\alpha^{2} = \alpha + 1$, so for any
number $n$, they both satisfy the equation

$$\alpha^{n} = \alpha^{n-1} + \alpha^{n-2}.$$

This looks a lot like the Fibonacci recursive equation $F_n = F_{n-1} + F_{n-2}$. In
other words, if we let $G_n = \alpha^{n}$ for either of the values of $\alpha$ listed above, then
$G_n = G_{n-1} + G_{n-2}$.

In fact, we can do even better by using both of the values, so we let $\alpha$ be the
first value and $\beta$ be the second value,

$$\alpha = \frac{1 + \sqrt{5}}{2} \qquad\text{and}\qquad \beta = \frac{1 - \sqrt{5}}{2}.$$

We now consider the sequence

$$H_n = A\alpha^{n} + B\beta^{n}, \qquad n = 1, 2, 3, \ldots.$$

It has the property

$$
\begin{aligned}
H_{n-1} + H_{n-2} &= (A\alpha^{n-1} + B\beta^{n-1}) + (A\alpha^{n-2} + B\beta^{n-2}) \\
&= A(\alpha^{n-1} + \alpha^{n-2}) + B(\beta^{n-1} + \beta^{n-2}) \\
&= A\alpha^{n} + B\beta^{n} \\
&= H_n,
\end{aligned}
$$

so $H_n$ satisfies the same recursive formula as the Fibonacci sequence, and we are
free to choose the numbers $A$ and $B$ to have any values that we want.

The idea now is to choose $A$ and $B$ so that the $H_n$ sequence and the Fibonacci
sequence start with the same two values. In other words, we want to choose $A$
and $B$ such that

$$H_1 = F_1 = 1 \quad\text{and}\quad H_2 = F_2 = 1.$$

This means we need to solve

$$A\alpha + B\beta = 1 \quad\text{and}\quad A\alpha^2 + B\beta^2 = 1.$$

(Remember that $\alpha$ and $\beta$ are specific numbers.) These two equations are easy to
solve. We use $\alpha^2 = \alpha + 1$ and $\beta^2 = \beta + 1$ to rewrite the second equation as

$$A(\alpha + 1) + B(\beta + 1) = 1.$$

Subtracting the first equation from this gives

$$A + B = 0, \quad\text{so}\quad B = -A.$$

Substituting $B = -A$ into the first equation gives

$$A\alpha - A\beta = 1,$$

which lets us solve for

$$A = 1/(\alpha - \beta) = 1/\sqrt{5}.$$

Also $B = -A = -1/\sqrt{5}$, which gives us the formula

$$H_n = (\alpha^n - \beta^n)/\sqrt{5}.$$


The culmination of our calculations is the following beautiful formula for the
$n$th term of the Fibonacci sequence. It is named after Binet, who published it
in 1843, although the formula was known to Euler and to Daniel Bernoulli at least
100 years earlier.


Theorem 39.1 (Binet’s Formula). The Fibonacci sequence $F_n$ is the sequence
described by the recursion

$$F_1 = F_2 = 1 \quad\text{and}\quad F_n = F_{n-1} + F_{n-2} \quad\text{for } n = 3, 4, 5, \ldots.$$

Then the $n$th term of the Fibonacci sequence is given by the formula

$$F_n = \frac{1}{\sqrt{5}}\left\{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right\}.$$




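Proof. For each number $n = 1, 2, 3, \ldots$, let $H_n$ be the number

$$H_n = \frac{1}{\sqrt{5}}\left\{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right\}.$$

We will prove by induction on $n$ that $H_n = F_n$ for every number $n$.
First we check that

$$H_1 = \frac{1}{\sqrt{5}}\left\{\frac{1+\sqrt{5}}{2} - \frac{1-\sqrt{5}}{2}\right\} = \frac{1}{\sqrt{5}}\cdot\sqrt{5} = 1$$

and

$$H_2 = \frac{1}{\sqrt{5}}\left\{\left(\frac{1+\sqrt{5}}{2}\right)^2 - \left(\frac{1-\sqrt{5}}{2}\right)^2\right\}
= \frac{1}{\sqrt{5}}\left\{\frac{6+2\sqrt{5}}{4} - \frac{6-2\sqrt{5}}{4}\right\}
= \frac{1}{\sqrt{5}}\cdot\frac{4\sqrt{5}}{4} = 1.$$

This shows that $H_1 = F_1$ and $H_2 = F_2$.
Now suppose that $n \ge 3$ and that $H_i = F_i$ for every value of $i$ between 1
and $n - 1$. In particular, $H_{n-1} = F_{n-1}$ and $H_{n-2} = F_{n-2}$. We need to prove that
$H_n = F_n$. But we have already checked that

$$H_n = H_{n-1} + H_{n-2},$$

and we know from the definition of the Fibonacci sequence that

$$F_n = F_{n-1} + F_{n-2},$$

so we see that $H_n = F_n$. This completes our induction proof that $H_n = F_n$ for
every value of $n$. □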
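
As a numerical sanity check on Binet’s Formula, the following sketch compares the recursion with the closed form. It evaluates the formula in floating point and rounds, which is an assumption of this sketch rather than part of the proof, and it is only reliable for moderate values of n. The function names are illustrative.

```python
from math import sqrt


def fibonacci(n):
    """Return F_n from the recursion F_1 = F_2 = 1, F_n = F_(n-1) + F_(n-2)."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


def binet(n):
    """Evaluate Binet's formula in floating point and round to the nearest integer."""
    alpha = (1 + sqrt(5)) / 2
    beta = (1 - sqrt(5)) / 2
    return round((alpha ** n - beta ** n) / sqrt(5))


if __name__ == "__main__":
    for n in range(1, 21):
        print(n, fibonacci(n), binet(n))
```

For very large n the floating-point evaluation loses accuracy, so an exact computation should use the recursion instead.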


Historical Interlude. The number

$$\frac{1+\sqrt{5}}{2} = 1.61803\ldots$$

is called the Golden Ratio (or the Divine Proportion) and is often attributed to the
ancient Greeks, who assigned it the far less euphonious name of the “extreme and
mean ratio.” Various authors have attributed aesthetic merit to artistic compositions
built on the divine proportion. For example, it has been suggested that the
Parthenon (Figure 39.2) was designed so that its exterior dimensions are in the
golden ratio. Here is a small rectangle [figure: small golden rectangle] whose sides are
in the golden ratio, and here is a larger divinely proportioned rectangle [figure: larger
golden rectangle]. Do you find the proportions of these rectangles to be especially
pleasing to the eye?

Figure 39.2: The Parthenon


The Fibonacci sequence is an example of a Linear Recurrence Sequence. The
word linear in this context means that the $n$th term of the sequence is a linear
combination of some of the previous terms. Here are examples of some other
linear recurrence sequences:

$$A_n = 3A_{n-1} + 10A_{n-2}, \qquad A_1 = 1, \quad A_2 = 3.$$

The method that we used to derive Binet’s Formula for the $n$th Fibonacci number
can be used, mutatis mutandis,² to find a formula for the $n$th term of any linear
recurrence sequence. Of course, not all recurrence sequences are linear. Here are
some examples of recurrence sequences that are not linear:

$$E_n = E_{n-1}E_{n-2} + E_{n-3}, \qquad E_1 = 1, \quad E_2 = 2, \quad E_3 = 1.$$

In general, there is no simple expression for the $n$th term of a nonlinear recurrence
sequence. This does not mean that nonlinear sequences are uninteresting; quite the
contrary is true. But it does mean that they are much harder to analyze than linear
recurrence sequences.
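
To see the method at work on another linear recurrence, here is a hedged sketch for a second-order recurrence S_n = p·S_(n-1) + q·S_(n-2). The sample recurrence S_n = 5S_(n-1) − 6S_(n-2) with S_1 = 1, S_2 = 5 is a hypothetical example chosen for illustration (it is not one from the text), and the code assumes that x² = px + q has two distinct nonzero real roots.

```python
from math import sqrt


def closed_form(p, q, s1, s2):
    """Return a function n -> S_n for S_n = p*S_(n-1) + q*S_(n-2), S_1 = s1, S_2 = s2.

    Assumes q != 0 and p*p + 4*q > 0, so that x^2 = p*x + q has two
    distinct nonzero real roots alpha and beta.
    """
    disc = p * p + 4 * q
    if q == 0 or disc <= 0:
        raise ValueError("this sketch needs two distinct nonzero real roots")
    alpha = (p + sqrt(disc)) / 2
    beta = (p - sqrt(disc)) / 2
    # Solve A*alpha + B*beta = s1 and A*alpha^2 + B*beta^2 = s2 for A and B.
    A = (s2 - beta * s1) / (alpha * (alpha - beta))
    B = (alpha * s1 - s2) / (beta * (alpha - beta))
    return lambda n: A * alpha ** n + B * beta ** n


def by_recursion(p, q, s1, s2, n):
    """Return S_n by applying the recursion directly."""
    a, b = s1, s2
    for _ in range(n - 1):
        a, b = b, p * b + q * a
    return a


if __name__ == "__main__":
    formula = closed_form(5, -6, 1, 5)      # roots 3 and 2, so S_n = 3^n - 2^n
    for n in range(1, 11):
        print(n, by_recursion(5, -6, 1, 5, n), round(formula(n)))
```

Taking p = q = 1 and s1 = s2 = 1 recovers the Fibonacci case treated above.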


The Fibonacci Sequence Modulo m 


What happens to the numbers in the Fibonacci sequence if we reduce them mod- 
ulo m? There are only finitely many different numbers modulo m, so the values 
do not get larger and larger. As always, we start by computing some examples. 


²A useful Latin phrase meaning “the necessary changes having been made.” The implication, of
course, is that the necessary changes are relatively minor.

Here’s what the Fibonacci sequence modulo $m$ looks like for the first few values
of $m$.

F_n (mod 2)   1, 1, 0, 1, 1, 0, 1, 1, 0, ...
F_n (mod 3)   1, 1, 2, 0, 2, 2, 1, 0, 1, 1, 2, 0, 2, 2, 1, ...
F_n (mod 4)   1, 1, 2, 3, 1, 0, 1, 1, 2, 3, 1, 0, 1, 1, 2, ...
F_n (mod 5)   1, 1, 2, 3, 0, 3, 3, 1, 4, 0, 4, 4, 3, 2, 0, 2, 2, 4, 1, 0, 1, 1, 2, ...
F_n (mod 6)   1, 1, 2, 3, 5, 2, 1, 3, 4, 1, 5, 0, 5, 5, 4, 3, 1, 4, 5, 3, 2, 5, 1, 0, 1, 1, 2, 3, ...

Notice in each case that the Fibonacci sequence eventually starts to repeat. In other
words, when we compute the Fibonacci sequence modulo $m$, we eventually find
two consecutive 1’s appearing, and as soon as this happens, the sequence repeats. (We
leave as an exercise for you to prove that this always happens.) Thus there is an
integer $N \ge 1$ such that

$$F_{n+N} \equiv F_n \pmod{m} \quad\text{for all } n = 1, 2, 3, \ldots.$$


The smallest such integer $N$ is called the period of the Fibonacci sequence modulo
$m$. We denote it by $N(m)$. The preceding examples give us the following short
table:

    m     |  2   3   4   5   6
    N(m)  |  3   8   6  20  24
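
The period N(m) is easy to compute by machine. The sketch below simply steps through the sequence modulo m until two consecutive 1’s reappear; it relies on the fact (left as an exercise in the text) that this always happens. The function name `fibonacci_period` is an illustrative choice.

```python
def fibonacci_period(m):
    """Return N(m), the period of the Fibonacci sequence modulo m (for m >= 2)."""
    if m < 2:
        raise ValueError("modulus must be at least 2")
    a, b = 1, 1                  # the pair (F_1, F_2) modulo m
    steps = 0
    while True:
        a, b = b, (a + b) % m    # advance one step along the sequence
        steps += 1
        if (a, b) == (1, 1):     # two consecutive 1's: the sequence has restarted
            return steps


if __name__ == "__main__":
    for m in range(2, 13):
        print(m, fibonacci_period(m))
```

Running it for larger ranges of m is a convenient way to gather data for the conjectures discussed in this chapter.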


The period of the Fibonacci sequence modulo $m$ exhibits many interesting patterns,
but our brief table is much too short to use in making conjectures. For now
we concentrate on the case that the modulus is a prime $p$. Table 39.1 lists the
period $N(p)$ for all primes $p \le 229$. Looking at the first two columns of the table,
we immediately notice the five values

$$N(11) = 10, \quad N(31) = 30, \quad N(41) = 40, \quad N(61) = 60, \quad N(71) = 70,$$

so we might be tempted to conjecture that if $p \equiv 1 \pmod{10}$, then $N(p) = p - 1$.
Unfortunately, this conjecture is not correct, since later entries in the table include

$$N(101) = 50, \quad N(151) = 50, \quad N(181) = 90, \quad\text{and}\quad N(211) = 42.$$

However, we observe that in all cases the period $N(p)$ divides $p - 1$. This suggests
that we look at the list of the primes $p$ satisfying $N(p) \mid p - 1$,

    11, 19, 29, 31, 41, 59, 61, 71, 79, 89, 101, 109, 131, 139, 149,
    151, 179, 181, 191, 199, 211, 229.

      p  | N(p)
     179 | 178
     181 |  90
     191 | 190
     193 | 388
     197 | 396
     199 |  22
     211 |  42
     223 | 448
     227 | 456
     229 | 114


Table 39.1: The Period of the Fibonacci Sequence Modulo the Prime p


The pattern is obvious. These are the primes that are congruent to 1 or 9 modulo 10,
which is the same as the set of primes that are congruent to 1 or 4 modulo 5. So
we are led to conjecture that

$$p \equiv 1 \text{ or } 4 \pmod{5} \implies N(p) \mid p - 1.$$

How might we prove this conjecture? One idea is to use Binet’s formula modulo
$p$, but Binet’s formula involves $\sqrt{5}$. However, if $p$ is congruent to 1 or 4 modulo
5, then Quadratic Reciprocity tells us that 5 is a square modulo $p$, so we can
find a number that plays the role of the square root of 5 modulo $p$. With these ideas
in hand, we are ready to prove our conjecture.


Theorem 39.2 (Fibonacci Sequence Modulo p Theorem). Let $p$ be a prime that is
congruent to either 1 or 4 modulo 5. Then the period of the Fibonacci sequence
modulo $p$ satisfies

$$N(p) \mid p - 1.$$


Proof. We are assuming that $p \equiv 1$ or $4 \pmod{5}$, so the Law of Quadratic
Reciprocity (Theorem 22.1) tells us that

$$\left(\frac{5}{p}\right) = \left(\frac{p}{5}\right) = \left(\frac{1 \text{ or } 4}{5}\right) = 1.$$

Thus 5 is a quadratic residue modulo $p$, so we can find a number $c$ with the property
that $c^2 \equiv 5 \pmod{p}$. We will assume that $c$ is odd, since if it isn’t odd, we can
always use $c + p$ instead. Further, $c \not\equiv 0 \pmod{p}$, so $c$ has a mod $p$ inverse, which
we denote by $c^{-1}$. In other words, $c^{-1}$ is a number satisfying $cc^{-1} \equiv 1 \pmod{p}$.
We now define a sequence of numbers modulo $p$ by the formula

$$J_n \equiv c^{-1}\left(\left(\frac{1+c}{2}\right)^n - \left(\frac{1-c}{2}\right)^n\right) \pmod{p}.$$

(Notice that this is exactly Binet’s formula if we treat $c$ as $\sqrt{5}$.) Using the fact that
$c^2 \equiv 5 \pmod{p}$, it is easy to check that

$$J_1 \equiv J_2 \equiv 1 \pmod{p} \quad\text{and}\quad J_n \equiv J_{n-1} + J_{n-2} \pmod{p} \quad\text{for all } n \ge 3.$$

Thus the sequence $J_n$ has the same starting values and satisfies the same recursion
as the Fibonacci sequence modulo $p$. It follows that

$$F_n \equiv J_n \pmod{p} \quad\text{for all } n \ge 1.$$

To simplify notation, we let

$$U = \frac{1+c}{2} \quad\text{and}\quad V = \frac{1-c}{2},$$

and then

$$F_n \equiv c^{-1}(U^n - V^n) \pmod{p}.$$

Note that $UV = (1 - c^2)/4 \equiv -1 \pmod{p}$, so neither $U$ nor $V$ is divisible by $p$.
In particular, we can use Fermat’s Little Theorem (Theorem 9.1) to deduce that

$$\begin{aligned}
F_{i+j(p-1)} &\equiv c^{-1}\left(U^{i+j(p-1)} - V^{i+j(p-1)}\right) \pmod{p} \\
&\equiv c^{-1}\left(U^i \cdot (U^{p-1})^j - V^i \cdot (V^{p-1})^j\right) \pmod{p} \\
&\equiv c^{-1}(U^i - V^i) \pmod{p} \\
&\equiv F_i \pmod{p}.
\end{aligned}$$

Thus the Fibonacci sequence modulo $p$ repeats every $p - 1$ steps.
However, the definition of $N(p)$ says that the sequence repeats every $N(p)$
steps, and that $N(p)$ is the smallest such value. We divide $p - 1$ by $N(p)$ to get a
quotient and remainder

$$p - 1 = N(p)q + r \quad\text{with } 0 \le r < N(p).$$

Using the fact that the sequence repeats every $p - 1$ steps and that it also repeats
every $N(p)$ steps allows us to compute

$$F_i \equiv F_{i+(p-1)j} \equiv F_{i+N(p)qj+rj} \equiv F_{i+rj} \pmod{p}.$$

Thus the Fibonacci sequence also repeats every $r$ steps. But $r < N(p)$, and $N(p)$
is the smallest possible positive period, so we must have $r = 0$. Hence

$$p - 1 = N(p)q,$$

which completes the proof that $N(p)$ divides $p - 1$. □

This concludes our discussion of the period of the Fibonacci sequence modulo
$m$, but there are many other questions to ask and many more patterns to be discovered.
For example, are there infinitely many primes satisfying $N(p) = p - 1$?
This is not currently known! In Exercises 39.13–39.16 you will be asked to investigate
further the values of $N(m)$ for both prime and composite values of $m$.
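
The construction in the proof of Theorem 39.2 can be tried out numerically. The sketch below finds a square root c of 5 modulo p by brute force (an assumption that is only sensible for small primes), forms the sequence J_n from the proof, and compares it with the Fibonacci sequence modulo p. The function names are illustrative, and `pow(c, -1, p)` requires Python 3.8 or later.

```python
def fibonacci_mod(n, p):
    """Return F_n modulo p using the recursion."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, (a + b) % p
    return a % p


def binet_mod(n, p):
    """Return J_n modulo p for a prime p congruent to 1 or 4 modulo 5."""
    # Brute-force search for c with c^2 = 5 (mod p); fine for small p.
    c = next(x for x in range(1, p) if (x * x - 5) % p == 0)
    if c % 2 == 0:
        c += p                   # make c odd so (1 + c)/2 and (1 - c)/2 are integers
    c_inv = pow(c, -1, p)        # inverse of c modulo p
    U, V = (1 + c) // 2, (1 - c) // 2
    return c_inv * (pow(U, n, p) - pow(V, n, p)) % p


if __name__ == "__main__":
    p = 11
    print([fibonacci_mod(n, p) for n in range(1, 21)])
    print([binet_mod(n, p) for n in range(1, 21)])
```

The two printed lists should agree, and each should repeat after p − 1 = 10 terms.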


Exercises 


39.1. (a) Look at a table of Fibonacci numbers and compare the values of $F_m$ and $F_{mn}$
for various choices of $m$ and $n$. Try to find a pattern. [Hint. Look for a divisibility
pattern.]

(b) Prove that the pattern you found in (a) is true.

(c) If $\gcd(m, n) = 1$, try to find a stronger pattern involving the values of $F_m$, $F_n$, and
$F_{mn}$.

(d) Is the pattern that you found in (c) still true if $\gcd(m, n) \ne 1$?

(e) Prove that the pattern you found in (c) is true.


39.2. (a) Find as many square Fibonacci numbers as you can. Do you think that there are 
finitely many or infinitely many square Fibonacci numbers? 
(b) Find as many triangular Fibonacci numbers as you can. Do you think there are finitely 
many or infinitely many triangular Fibonacci numbers? 


39.3. (a) Make a list of Fibonacci numbers $F_n$ that are prime.
(b) Using your data, fill in the box to make an interesting conjecture:

    If $F_n$ is prime, then $n$ is [        ].

[Hint. Actually, your conjecture should be that the statement is true with one exception.]

(c) Does your conjecture in (b) work in the other direction? In other words, is the
following statement true, where the box is the same as in (b)?

    If $n$ is [        ], then $F_n$ is prime.

(d) Prove that your conjecture in (b) is correct.


39.4. The Fibonacci numbers satisfy many amazing identities.

(a) Compute the quantity $F_{n+1}^2 - F_{n-1}^2$ for the first few integers $n = 2, 3, 4, \ldots$ and try
to guess its value. [Hint. It is equal to a Fibonacci number.] Prove that your guess is
correct.

(b) Same question (and same hint!) for the quantity $F_{n+1}^3 + F_n^3 - F_{n-1}^3$.

(c) Same question (and almost the same hint) for the quantity $F_{n+2}^2 - F_{n-2}^2$.

(d) Same question (but not the same hint!) for the quantity $F_{n-1}F_{n+1} - F_n^2$.

(e) Same question for $4F_nF_{n-1} + F_{n-2}^2$. [Hint. Compare the value with the square of a
Fibonacci number.]

(f) Same question for the quantity $F_{n+4}^4 - 4F_{n+3}^4 - 19F_{n+2}^4 - 4F_{n+1}^4 + F_n^4$.

39.5. A Markoff triple is a solution $(x, y, z)$ in positive integers to the equation

$$x^2 + y^2 + z^2 = 3xyz.$$

(a) Prove that if $(x_0, y_0, z_0)$ is a Markoff triple, then so is $(x_0, z_0, 3x_0z_0 - y_0)$.
(b) Prove that $(1, F_{2k-1}, F_{2k+1})$ is a Markoff triple for all $k \ge 1$.
(See Exercises 30.2 and 30.3 for other properties of the Markoff equation.)


39.6. The Lucas sequence is the sequence of numbers $L_n$ given by the rules $L_1 = 1$,
$L_2 = 3$, and $L_n = L_{n-1} + L_{n-2}$.
(a) Write down the first 10 terms of the Lucas sequence.
(b) Find a simple formula for $L_n$, similar to Binet’s Formula for the Fibonacci number
$F_n$.
(c) Compute the value of $L_n^2 - 5F_n^2$ for each $1 \le n \le 10$. Make a conjecture about this
value. Prove that your conjecture is correct.
(d) Show that $L_{3n}$ and $F_{3n}$ are even for all values of $n$. Combining this fact with the
formula you discovered in (c), find an interesting equation satisfied by the pair of
numbers $\left(\tfrac{1}{2}L_{3n}, \tfrac{1}{2}F_{3n}\right)$. Relate your answer to the material in Chapters 32 and 34.


39.7. Write down the first few terms for each of the following linear recursion sequences,
and then find a formula for the $n$th term similar to Binet’s formula for the $n$th Fibonacci
number. Be sure to check that your formula is correct for the first few values.

(a) $A_n = 3A_{n-1} + 10A_{n-2}$, $A_1 = 1$, $A_2 = 3$

(b) $B_n = 2B_{n-1} - 4B_{n-2}$, $B_1 = 0$, $B_2 = -2$

(c) $C_n = 4C_{n-1} - C_{n-2} - 6C_{n-3}$, $C_1 = 0$, $C_2 = 0$, $C_3 = 1$
[Hint. For (b), you'll need to use complex numbers. For (c), the cubic polynomial has
some small integer roots.]


39.8. Let $P_n$ be the linear recursion sequence defined by

$$P_n = P_{n-1} + 4P_{n-2} - 4P_{n-3}, \qquad P_1 = 1, \quad P_2 = 9, \quad P_3 = 1.$$

(a) Write down the first 10 terms of $P_n$.

(b) Does the sequence behave in a strange manner?

(c) Find a formula for $P_n$ that is similar to Binet’s formula. Does your formula for $P_n$
explain the strange behavior that you noted in (b)?


39.9. (This question requires some elementary calculus.)
(a) Compute the value of the limit

$$\lim_{n\to\infty} \frac{\log(F_n)}{n}.$$

Here $F_n$ is the $n$th Fibonacci number.
(b) Compute $\lim_{n\to\infty} (\log(A_n))/n$, where $A_n$ is the sequence in Exercise 39.7(a).
(c) Compute $\lim_{n\to\infty} (\log(|B_n|))/n$, where $B_n$ is the sequence in Exercise 39.7(b).
(d) Compute $\lim_{n\to\infty} (\log(C_n))/n$, where $C_n$ is the sequence in Exercise 39.7(c).


39.10. Write down the first few terms for each of the following nonlinear recursion
sequences. Can you find a simple formula for the $n$th term? Can you find any patterns in the
list of terms?

(a) $D_n = D_{n-1} + D_{n-2}^2$, $D_1 = 1$, $D_2 = 1$

(b) $E_n = E_{n-1}E_{n-2} + E_{n-3}$, $E_1 = 1$, $E_2 = 2$, $E_3 = 1$


39.11. Prove that the Fibonacci sequence modulo $m$ eventually repeats with two
consecutive 1’s. [Hint. The Fibonacci recursion can also be used backwards. Thus if you
know the values of $F_n$ and $F_{n+1}$, then you can recover the value of $F_{n-1}$ using the
formula $F_{n-1} = F_{n+1} - F_n$.]


39.12. Let $N = N(m)$ be the period of the Fibonacci sequence modulo $m$.
(a) What is the value of $F_N$ modulo $m$? What is the value of $F_{N-1}$ modulo $m$?
(b) Write out the Fibonacci sequence modulo $m$ in the reverse direction,

$$F_{N-1},\ F_{N-2},\ F_{N-3},\ \ldots,\ F_3,\ F_2,\ F_1 \pmod{m}.$$

Do this for several values of $m$, and try to find a pattern. [Hint. The pattern will be
more evident if you take some of the values modulo $m$ to lie between $-m$ and $-1$,
instead of between 1 and $m$.]

(c) Prove that the pattern you found in (b) is correct. 


39.13. The material in Table 39.2 suggests that if $m \ge 3$ then the period $N(m)$ of the
Fibonacci sequence modulo $m$ is always an even number. Prove that this is true, or find a
counterexample.


39.14. Let $N(m)$ be the period of the Fibonacci sequence modulo $m$.

(a) Use Table 39.2 to compare the values of $N(m_1)$, $N(m_2)$, and $N(m_1m_2)$ for various
values of $m_1$ and $m_2$, especially for $\gcd(m_1, m_2) = 1$.

(b) Make a conjecture relating $N(m_1)$, $N(m_2)$, and $N(m_1m_2)$ when $m_1$ and $m_2$ satisfy
$\gcd(m_1, m_2) = 1$.

(c) Use your conjecture from (b) to guess the values of $N(5184)$ and $N(6887)$. [Hint.
$6887 = 71 \cdot 97$.]

(d) Prove that your conjecture in (b) is correct.


39.15. Let $N(m)$ be the period of the Fibonacci sequence modulo $m$.

(a) Use Table 39.2 to compare the values of $N(p)$ and $N(p^2)$ for various primes $p$.

(b) Make a conjecture relating the values of $N(p)$ and $N(p^2)$ when $p$ is a prime.

(c) More generally, make a conjecture relating the value of $N(p)$ to the values of all the
higher powers $N(p^2), N(p^3), N(p^4), \ldots$.

(d) Use your conjectures from (b) and (c) to guess the values of $N(2209)$, $N(1024)$, and
$N(729)$. [Hint. $2209 = 47^2$. You can factor 1024 and 729 yourself!]

(e) Try to prove your conjectures in (b) and/or (c).

Table 39.2: The Period N(m) of the Fibonacci Sequence Modulo m


39.16. Let $N(m)$ be the period of the Fibonacci sequence modulo $m$. In the text we
analyzed $N(p)$ when $p$ is a prime satisfying $p \equiv 1$ or 4 modulo 5. This exercise asks you
to consider the other primes.
(a) Use Table 39.1 on page 333 to make a list of the periods $N(p)$ of the Fibonacci
sequence modulo $p$ when $p$ is a prime number satisfying $p \equiv 2$ or 3 modulo 5.
(b) If $p \equiv 1$ or 4 modulo 5, we proved that $N(p)$ divides $p - 1$. Formulate a similar
conjecture for the primes that satisfy $p \equiv 2$ or 3 modulo 5.
(c) Try to prove your conjecture in (b). (This is probably hard using only the tools that
you currently know.)
(d) The one prime that we have not considered is $p = 5$. For various values of $c$, look at
the sequence

$$n \cdot c^{n-1} \pmod{5}, \qquad n = 1, 2, 3, \ldots,$$

and compare it with the Fibonacci sequence modulo 5. Make a conjecture, and then
prove that your conjecture is correct.


Chapter 40 


Oh, What a Beautiful Function 


A long time ago¹ we found a formula for the sum of the first $n$ integers:

$$1 + 2 + 3 + 4 + \cdots + (n-1) + n = \frac{n(n+1)}{2} = \frac{1}{2}n^2 + \frac{1}{2}n.$$

This is a very beautiful and completely accurate formula, but there might be situations
where we’d prefer a formula that is less complicated, even at the cost of
losing some accuracy. Thus we can say that $1 + 2 + \cdots + n$ is approximately
equal to $\frac{1}{2}n^2$, since when $n$ is large, the $\frac{1}{2}n^2$ term is much larger than the $\frac{1}{2}n$ term.

Similarly, there is an exact formula for the sum of the first $n$ squares that we
proved in Chapter 26,

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}.$$

If we multiply out the right-hand side, this becomes

$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{3}n^3 + \frac{1}{2}n^2 + \frac{1}{6}n.$$

The $\frac{1}{3}n^3$ term is much larger than the other terms when $n$ is large, so we can say
that $1^2 + 2^2 + \cdots + n^2$ is approximately equal to $\frac{1}{3}n^3$, and if we want to be more
precise, we can say that the difference between $1^2 + 2^2 + \cdots + n^2$ and $\frac{1}{3}n^3$ is
more or less a multiple of $n^2$.

Approximate formulas of this sort appear quite frequently in number theory, as
well as in other areas of mathematics and computer science. They take the form

$$\left(\text{Complicated function of } n\right) = \left(\text{Simple function of } n\right) + \left(\text{A bound for the size of the error in terms of } n\right).$$

¹A long time ago in a chapter far, far away...

For example,

$$\underbrace{1^2 + 2^2 + \cdots + n^2}_{\text{complicated function of } n} = \underbrace{\tfrac{1}{3}n^3}_{\text{simple function of } n} + \left(\text{Error that is not much larger than } n^2\right).$$

The mathematical way to write this approximate formula is with “big-Oh” notation.
Using big-Oh notation, the previous formula is written in the form

$$1^2 + 2^2 + \cdots + n^2 = \frac{1}{3}n^3 + O(n^2).$$

Informally, this means that the difference between $1^2 + 2^2 + \cdots + n^2$ and $\frac{1}{3}n^3$ is
smaller than some fixed multiple of $n^2$.

The formal definition of big-Oh notation is somewhat abstract and can be confusing
at first. But if you keep the $1^2 + 2^2 + \cdots + n^2$ example in mind, you will
find that big-Oh notation is not that complicated and, after some practice, it becomes
very natural.


Definition. Suppose that $f(n)$, $g(n)$, and $h(n)$ are functions. The formula

$$f(n) = g(n) + O(h(n))$$

means that there is a constant $C$ and a starting value $n_0$ such that

$$|f(n) - g(n)| \le C|h(n)| \quad\text{for all } n \ge n_0.$$

In words, the difference between $f(n)$ and $g(n)$ is no larger than a constant multiple
of $h(n)$. When reading the formula $f(n) = g(n) + O(h(n))$ aloud, we say
that

“$f(n)$ equals $g(n)$ plus big-Oh of $h(n)$.”


Sometimes the function $g(n)$ is absent, which is the same as saying that $g(n) = 0$
for all $n$, so the formula

$$f(n) = O(h(n))$$

means that there is a constant $C$ and a starting value $n_0$ such that

$$|f(n)| \le C|h(n)| \quad\text{for all } n \ge n_0.$$

For example,

$$n^3 = O(2^n),$$

since²

$$n^3 \le 2^n \quad\text{for all } n \ge 10.$$

It is also very common to have formulas with $h(n)$ equal to the constant function
$h(n) = 1$. A common mistake is to believe that the formula

$$f(n) = O(1)$$

means that $f(n)$ is itself constant. Nothing could be further from the truth. The
formula $f(n) = O(1)$ means that $|f(n)|$ is smaller than a constant $C$. For example,
the function $f(n) = \frac{2n+3}{n+2}$ is certainly not constant, but it is true that

$$\frac{2n+3}{n+2} = O(1),$$

since

$$\left|\frac{2n+3}{n+2}\right| \le 2 \quad\text{for all } n \ge 1.$$


²Note that there are lots of possible choices for $C$ and $n_0$. For example, we could say that
$n^3 = O(2^n)$ since $n^3 \le 10 \cdot 2^n$ for all $n \ge 1$. But we cannot say that $n^3 = O(n^2)$, since there is
no choice of $C$ that makes $n^3$ smaller than $Cn^2$ when $n$ is large.

The Fibonacci sequence

$$1, 1, 2, 3, 5, 8, 13, 21, \ldots$$

provides another opportunity to use big-Oh notation. In Chapter 39 we proved
Binet’s beautiful formula for the $n$th Fibonacci number,

$$F_n = \frac{1}{\sqrt{5}}\left\{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right\}.$$

The two quantities appearing in this formula have the values

$$\frac{1+\sqrt{5}}{2} = 1.618034\ldots \quad\text{and}\quad \frac{1-\sqrt{5}}{2} = -0.618034\ldots.$$

When we take $\frac{1-\sqrt{5}}{2}$ and raise it to a large power, we get something that is very
small, while $\frac{1+\sqrt{5}}{2}$ raised to a large power is very large. So an approximate, but
still useful, version of Binet’s formula says that

$$F_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^n + O(0.61804^n).$$

In this case, the error term $O(0.61804^n)$ actually approaches zero extremely
rapidly as $n$ gets larger and larger. This contrasts with the big-Oh formula for the
sum $1^2 + 2^2 + \cdots + n^2$, where the error $O(n^2)$ got larger as $n$ increased, albeit at
a slower rate than the main term $\frac{1}{3}n^3$.
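
A big-Oh statement asserts an inequality for all sufficiently large n, so a computer can only test it over a finite range. Still, such a test is a useful sanity check on a proposed pair (C, n_0). The sketch below checks three of the examples from this chapter; the helper name `bound_holds` and the chosen ranges are assumptions of this sketch, and a passing check is evidence, not a proof.

```python
from fractions import Fraction
from math import sqrt


def bound_holds(f, g, h, C, n0, n_max):
    """Check |f(n) - g(n)| <= C * |h(n)| for every integer n0 <= n <= n_max."""
    return all(abs(f(n) - g(n)) <= C * abs(h(n)) for n in range(n0, n_max + 1))


def zero(n):
    return 0


def fibonacci(n):
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = b, a + b
    return a


# n^3 = O(2^n), with C = 1 and n0 = 10.
cubic_ok = bound_holds(lambda n: n ** 3, zero, lambda n: 2 ** n, 1, 10, 200)

# (2n + 3)/(n + 2) = O(1), with C = 2 and n0 = 1 (exact rational arithmetic).
ratio_ok = bound_holds(lambda n: Fraction(2 * n + 3, n + 2), zero, lambda n: 1, 2, 1, 200)

# F_n = (1/sqrt 5) * ((1 + sqrt 5)/2)^n + O(0.61804^n), with C = 1 and n0 = 1.
# Floating point is used here, so n is kept small to avoid rounding trouble.
golden = (1 + sqrt(5)) / 2
binet_ok = bound_holds(fibonacci, lambda n: golden ** n / sqrt(5),
                       lambda n: 0.61804 ** n, 1, 1, 25)

if __name__ == "__main__":
    print(cubic_ok, ratio_ok, binet_ok)
```

Lowering n0 from 10 to 9 in the first check makes it fail, since 9³ = 729 is larger than 2⁹ = 512.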

There are many methods for discovering and for proving big-Oh formulas. One
of the most powerful uses geometry and a little bit of calculus. We demonstrate this
geometric method by finding a big-Oh formula for the sum

$$1^k + 2^k + \cdots + n^k.$$

We already know that

$$1 + 2 + \cdots + n = \frac{1}{2}n^2 + O(n) \quad\text{and}\quad 1^2 + 2^2 + \cdots + n^2 = \frac{1}{3}n^3 + O(n^2),$$

so we might guess that

$$1^k + 2^k + \cdots + n^k = \frac{1}{k+1}n^{k+1} + O(n^k).$$

This is indeed the case.

Theorem 40.1. Fix a power $k \ge 1$. Then

$$1^k + 2^k + 3^k + \cdots + n^k = \frac{1}{k+1}n^{k+1} + O(n^k).$$

Proof. Let $S(n) = 1^k + 2^k + \cdots + n^k$ denote the sum that we are trying to estimate.
We draw a bunch of rectangles. The first rectangle has base 1 and height $1^k$,
the second rectangle has base 1 and height $2^k$, the third rectangle has base 1 and
height $3^k$, and so on. Placing these rectangles side-by-side, we get the picture
illustrated in Figure 40.1.³ Notice that if we sum the areas of all of the rectangles,
we get precisely the quantity $S(n)$.

Rather than exactly computing the area inside the rectangles, we approximate
the total rectangle area by instead computing the area of a simpler region. If we
draw the curve $y = x^k$, then, as you can see in Figure 40.2, the rectangles fit fairly
snugly under the curve. And since the rectangles in Figure 40.2 lie underneath the
curve $y = x^k$, we know that the area inside the rectangles is smaller than the area
under the curve.


³In order to fit the diagram on the page, we have not drawn the rectangles in Figure 40.1 to be
the correct size, so you should use the picture only as an aid in gaining understanding of the general
idea. Feel free to draw your own pictures to the proper scale, say with $k = 2$, but be prepared to use
a large piece of paper!

Figure 40.1: Rectangles Whose Total Area Is $1^k + 2^k + \cdots + n^k$


In other words, from the picture we have deduced that

$$1^k + 2^k + \cdots + n^k = \left(\text{Area of rectangles}\right) < \left(\text{Area under the curve } y = x^k \text{ with } 1 \le x \le n+1\right).$$

We can use basic calculus to compute the area under the curve.

$$\left(\text{Area under the curve } y = x^k \text{ with } 1 \le x \le n+1\right) = \int_1^{n+1} x^k\,dx = \left.\frac{x^{k+1}}{k+1}\right|_1^{n+1} = \frac{1}{k+1}\left((n+1)^{k+1} - 1\right).$$

This gives us an upper bound

$$1^k + 2^k + \cdots + n^k < \frac{1}{k+1}(n+1)^{k+1}.$$

(We have dropped the $-1$ on the right-hand side, since leaving it in gives only a
slightly stronger estimate.)

Figure 40.2: Area Under Curve Is Larger Than Area of Rectangles


Similarly, if we slide the rectangles over to the left one unit, then, as illustrated
in Figure 40.3, the rectangles completely cover the area under the curve $y = x^k$
between $0 \le x \le n$. This means that the area of the rectangles is larger than the
area under this part of the curve, so we get a corresponding lower bound

$$1^k + 2^k + \cdots + n^k = \left(\text{Area of rectangles}\right) > \left(\text{Area under the curve } y = x^k \text{ with } 0 \le x \le n\right)
= \int_0^n x^k\,dx = \left.\frac{x^{k+1}}{k+1}\right|_0^n = \frac{1}{k+1}n^{k+1}.$$

Putting together our upper and lower bounds, we have proved that

$$\frac{1}{k+1}n^{k+1} < 1^k + 2^k + \cdots + n^k < \frac{1}{k+1}(n+1)^{k+1}.$$

We subtract $\frac{1}{k+1}n^{k+1}$ to get

$$0 < \left(1^k + 2^k + \cdots + n^k\right) - \frac{1}{k+1}n^{k+1} < \frac{1}{k+1}\left((n+1)^{k+1} - n^{k+1}\right). \qquad (*)$$

Figure 40.3: Area Under Curve Is Smaller Than Area of Rectangles


We need to show that the upper bound is not too large. To do this, we use the
binomial expansion (Chapter 38) for $(n+1)^{k+1}$,

$$(n+1)^{k+1} = n^{k+1} + \binom{k+1}{1}n^k + \binom{k+1}{2}n^{k-1} + \cdots + \binom{k+1}{k}n + 1.$$

The crucial point is that the largest term is $n^{k+1}$ and all the other terms involve
smaller powers of $n$. Hence

$$\begin{aligned}
(n+1)^{k+1} - n^{k+1} &= \binom{k+1}{1}n^k + \binom{k+1}{2}n^{k-1} + \cdots + \binom{k+1}{k}n + 1 \\
&\le \binom{k+1}{1}n^k + \binom{k+1}{2}n^k + \cdots + \binom{k+1}{k}n^k + n^k \\
&= (\text{some mess involving } k) \cdot n^k.
\end{aligned}$$


We combine this estimate with our earlier inequality $(*)$ to obtain

$$0 < \left(1^k + 2^k + \cdots + n^k\right) - \frac{1}{k+1}n^{k+1} < (\text{some new mess involving } k) \cdot n^k.$$

Of course, the “new mess” is $\frac{1}{k+1}$ times the “old mess,” but in any case, it only
involves $k$ and does not depend on $n$. This proves that

$$\left(1^k + 2^k + \cdots + n^k\right) = \frac{1}{k+1}n^{k+1} + O(n^k),$$

and indeed it proves something a bit stronger, since it shows that the sum of the
$k$th powers $1^k + 2^k + \cdots + n^k$ is always strictly larger than $\frac{1}{k+1}n^{k+1}$. This
completes the proof of the theorem. □
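
The two bounds obtained in the proof of Theorem 40.1 can be tested directly. The sketch below multiplies through by k + 1 so that only integer arithmetic is needed; the function names are illustrative, and a finite check of this kind is of course no substitute for the proof.

```python
def power_sum(k, n):
    """Return 1^k + 2^k + ... + n^k."""
    return sum(j ** k for j in range(1, n + 1))


def sandwich_holds(k, n):
    """Check n^(k+1)/(k+1) < 1^k + ... + n^k < (n+1)^(k+1)/(k+1) using integers only."""
    scaled = (k + 1) * power_sum(k, n)      # multiply the inequality through by k + 1
    return n ** (k + 1) < scaled < (n + 1) ** (k + 1)


if __name__ == "__main__":
    for k in (1, 2, 3):
        for n in (1, 10, 100):
            print(k, n, power_sum(k, n), sandwich_holds(k, n))
```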


Describing How Long a Computation Will Take 


Big-Oh notation is frequently used to describe how long it takes to do a certain
computation using a particular method. For example, suppose that we fix a number
$a$ and a modulus $m$ and that we want to compute the value of $a^n \pmod{m}$ for
some large value of the exponent $n$. How long does it take us?

One way to do the computation is to compute

    $a_1 = a \pmod{m}$, and then
    $a_2 = a \cdot a_1 \pmod{m}$, and then
    $a_3 = a \cdot a_2 \pmod{m}$, and then ...

Eventually we get to $a_n$, which is equal to $a^n \pmod{m}$. We end up having to do $n$
steps, where each step consists of one multiplication and one reduction modulo $m$.
Assuming that each step takes a more or less fixed amount of time, the total time
is a constant multiple of $n$. So the running time of this method is $O(n)$.

Of course, no one who has read Chapter 16 would ever use this absurdly inefficient
method to compute $a^n \pmod{m}$. The method of successive squaring allows
us to compute $a^n \pmod{m}$ much more rapidly. As described in Chapter 16, the
method of successive squares has three pieces:


1. Write $n$ as a sum of powers of 2 (the binary expansion)

$$n = u_0 + u_1 \cdot 2 + u_2 \cdot 4 + u_3 \cdot 8 + \cdots + u_r \cdot 2^r, \qquad (40.1)$$

where $u_r = 1$ and every $u_i$ is either 0 or 1.


2. Create a table of values

$$A_0 = a, \quad A_1 = A_0^2 \pmod{m}, \quad A_2 = A_1^2 \pmod{m}, \quad A_3 = A_2^2 \pmod{m}, \quad \ldots, \quad A_r = A_{r-1}^2 \pmod{m}.$$


3. Compute the product

$$A_0^{u_0} \cdot A_1^{u_1} \cdot A_2^{u_2} \cdots A_r^{u_r} \pmod{m}.$$

Each of the three pieces takes approximately $r$ steps, so the total time is a constant
multiple of $r$. Do you see how the number $r$ is related to the exponent $n$? From the
binary expansion (40.1), the number $n$ is at least $2^r$, so if we take logarithms, we
see that⁴

$$r \le \log_2(n).$$

Therefore, the method of successive squares allows us to compute $a^n \pmod{m}$ in
time $O(\log_2(n))$. This is immensely faster than time $O(n)$, since $\log_2(n)$ is much
smaller than $n$ when $n$ is large.
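
The three pieces of the method of successive squaring can be interleaved in a single loop. The following is a minimal sketch, not the book’s own code: it reads off the binary digits of n from the bottom up, squares as it goes, and also counts the modular multiplications so that the O(log_2(n)) behavior can be observed. The function name `power_mod` is an illustrative choice.

```python
def power_mod(a, n, m):
    """Compute a^n (mod m) for n >= 1 by successive squaring.

    Returns the value together with the number of modular multiplications used.
    """
    result = 1
    square = a % m               # A_0 = a
    multiplications = 0
    while n > 0:
        if n & 1:                # binary digit u_i is 1: include A_i in the product
            result = (result * square) % m
            multiplications += 1
        n >>= 1
        if n > 0:                # compute A_(i+1) = A_i^2 only if it is still needed
            square = (square * square) % m
            multiplications += 1
    return result, multiplications


if __name__ == "__main__":
    value, count = power_mod(7, 10 ** 6, 853)
    print(value, count)          # far fewer than the 10^6 steps of the naive method
```

Python’s built-in three-argument `pow(a, n, m)` computes the same value and can be used to check the result.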

The number $\log_2(n)$ is the number of binary digits in $n$; that is, it’s the number
of digits when we write $n$ using binary notation. Similarly, $\log_{10}(n)$ is the number
of decimal digits in $n$. So, roughly speaking, $\log(n)$ tells us each of the following
pieces of information:


• how much time it takes to write down the number $n$;
• how long it takes us to describe the number $n$ to another person;
• how long it takes to input the number $n$ into a computer or to output the
number $n$ from a computer.


We can summarize this by saying that it takes time $O(\log(n))$ to describe the number
$n$.

It is thus interesting and somewhat surprising that it only takes O(log(n)) mul- 
tiplications to compute the quantity a” (mod m), since we have seen that it al- 
ready takes time O(log(n)) simply to input the number n. The successive squar- 
ing method is said to take linear time because the number of multiplications is 
at most a constant multiple of the time it takes to input the initial information. 
[Of course, if m and n are about the same size, then each multiplication takes at 
least O(log(n)) steps, so the total time is at least O(log(n)?).] 

Let’s look at another problem, that of multiplying two polynomials of degree $d$,

$$F(X) = a_0 + a_1X + \cdots + a_dX^d \quad\text{and}\quad G(X) = b_0 + b_1X + \cdots + b_dX^d.$$

There are $2d + 2$ coefficients, so it takes time $O(d)$ to describe the polynomials.⁵


⁴The function $\log_2(x)$ is the logarithm to the base 2. By definition, the value of $\log_2(x)$ is the
number $y$ that is needed to make $2^y = x$.

⁵Notice that the degree of a polynomial plays a role similar to that played by the logarithm of a
number. Another property shared by the degree and the logarithm is illustrated by

$$\deg(F(X)G(X)) = \deg(F(X)) + \deg(G(X)) \quad\text{and}\quad \log(MN) = \log(M) + \log(N).$$

Thus both the degree and the logarithm convert multiplication into addition.

The product of $F(X)$ and $G(X)$ is given by the formula

$$H(X) = F(X)G(X) = c_0 + c_1X + c_2X^2 + \cdots + c_{2d}X^{2d},$$

where

$$c_j = \begin{cases}
a_0b_j + a_1b_{j-1} + \cdots + a_jb_0 & \text{if } 0 \le j \le d, \\
a_{j-d}b_d + a_{j-d+1}b_{d-1} + \cdots + a_db_{j-d} & \text{if } d < j \le 2d.
\end{cases}$$

If $0 \le j \le d$, then computing $c_j$ requires $j$ additions and $j + 1$ multiplications,
so it takes time $O(j)$ to compute $c_j$. And if $d < j \le 2d$, then it takes time
$O(2d - j + 1)$ to compute $c_j$. Hence the total time to compute the product $H(X)$
is

$$\sum_{j=0}^{d} O(j) + \sum_{j=d+1}^{2d} O(2d - j + 1) = O\left(\sum_{j=0}^{d} j + \sum_{j=d+1}^{2d} (2d - j + 1)\right)
= O\left(\frac{d^2+d}{2} + \frac{d^2+d}{2}\right) = O(d^2).$$

Thus the time to compute $F(X)G(X)$ is big-Oh of the square of the amount of
time it takes to input the initial data, so we say that it takes quadratic time to
compute the product of two polynomials.⁶
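
The quadratic running time is visible in a direct implementation of the formula for the coefficients of the product. In this sketch a polynomial is represented by its list of coefficients [a_0, a_1, ..., a_d]; that representation and the function name `multiply_polynomials` are assumptions made for illustration.

```python
def multiply_polynomials(f, g):
    """Multiply polynomials given as coefficient lists [a_0, a_1, ..., a_d].

    Returns the coefficient list of the product and the number of
    coefficient multiplications performed.
    """
    product = [0] * (len(f) + len(g) - 1)
    multiplications = 0
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            product[i + j] += a * b      # a_i * b_j contributes to c_(i+j)
            multiplications += 1
    return product, multiplications


if __name__ == "__main__":
    # (1 + 2X + 3X^2)(4 + 5X + 6X^2)
    print(multiply_polynomials([1, 2, 3], [4, 5, 6]))
```

For two polynomials of degree d the double loop performs (d + 1)² coefficient multiplications, in line with the O(d²) estimate.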


⁶Of course, we really mean that this particular method of computing $F(X)G(X)$ takes quadratic
time. There are other methods, such as Karatsuba multiplication and Fast Fourier Transforms, that
are much faster. The fastest of these, based on the Fast Fourier Transform, can multiply two
polynomials in time $O(d \log d)$, so just slightly slower than linear time. The advantage of linear
time over quadratic time is not too important if $d$ is small, say $d = 10$ or $d = 15$, but if
$d = 10000$, there is a considerable difference.


Exercises
40.1. (a) Suppose that

$$f_1(n) = g_1(n) + O(h(n)) \quad\text{and}\quad f_2(n) = g_2(n) + O(h(n)).$$

Prove that

$$f_1(n) + f_2(n) = g_1(n) + g_2(n) + O(h(n)).$$

(b) More generally, if $a$ and $b$ are any constants, prove that

$$af_1(n) + bf_2(n) = ag_1(n) + bg_2(n) + O(h(n)).$$

(Note that the constant $C$ appearing in the definition of big-Oh notation is allowed
to depend on the constants $a$ and $b$. The only requirement is that there be one fixed
value of $C$ that works for all sufficiently large values of $n$.)

(c) The formula that you proved in part (b) shows that big-Oh formulas (with the same $h$)
can be added, subtracted, and multiplied by constants. Is it also okay to multiply them
by quantities that are not constant? In other words, if $f(n) = g(n) + O(h(n))$ and
if $k(n)$ is another function of $n$, is it true that

$$k(n)f(n) = k(n)g(n) + O(h(n))?$$

If not, how about

$$k(n)f(n) = k(n)g(n) + O(k(n)h(n))?$$

40.2. Suppose that

$$f_1(n) = g_1(n) + O(h_1(n)) \quad\text{and}\quad f_2(n) = g_2(n) + O(h_2(n)).$$

Prove that

$$f_1(n) + f_2(n) = g_1(n) + g_2(n) + O(\max\{h_1(n), h_2(n)\}).$$
40.3. Which of the following functions are O(1)? Why? 


2 
@ fa=2t* & sm=S2t © re 
(d) f(n) = cos(n) (e) in) = sas Oy i) = a 


40.4. Find a big-Oh estimate for the sum of square roots; that is, fill in the boxes in the 
following formula: 


$$\sqrt{1} + \sqrt{2} + \sqrt{3} + \cdots + \sqrt{n} = \square\, n^{\square} + O(n^{\square}).$$


40.5. (a) Prove the following big-Oh estimate for the sum of the reciprocals of the integers:

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} = \ln(n) + O(1).$$

[Here $\ln(x)$ is the natural logarithm of $x$.]
(b) Prove the stronger statement that there is a constant $\gamma$ such that

$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} = \ln(n) + \gamma + O\Bigl(\frac{1}{n}\Bigr).$$

The number $\gamma$, which is equal to 0.577215664..., is called Euler’s constant. Very
little is known about Euler’s constant. For example, it is not known whether or not
$\gamma$ is a rational number.


40.6. Bob and Alice play the following guessing game. Alice picks a number between 1 
and n. Bob starts guessing numbers and, after each guess, Alice tells him whether he is 
right or wrong. Let G(n) be the most guesses it can take Bob to guess Alice’s number, 
assuming that he uses the best possible strategy.
(a) Prove that $G(n) = O(n)$.

(b) Prove that $G(n)$ is not $O(\sqrt{n})$.

(c) More generally, if $G(n) = O(h(n))$, what can you say about the function $h(n)$?

(d) Suppose that we change the rules of the game so that, after Bob guesses a number,
Alice tells him whether his guess is too high, too low, or exactly right. Describe
a strategy for Bob so that his number of guesses before winning satisfies
$G(n) = O(\log_2(n))$. [Hint. Eliminate half the remaining numbers with each guess.]


40.7. Bob knows that the number $n$ is composite and he wants to find a nontrivial factor.
He employs the following strategy: Check if 2 divides $n$, then check if 3 divides $n$, then
check if 4 divides $n$, etc. Let $F(n)$ be the number of steps it takes until he finds a factor
of $n$.

(a) Prove that $F(n) = O(\sqrt{n})$.

(b) Suppose that, instead of checking every number 2, 3, 4, 5, 6, ..., Bob only checks
if $n$ is divisible by primes 2, 3, 5, 7, 11, .... Explain why this strategy still works and
show that the number of steps $F(n)$ now satisfies $F(n) = O\bigl(\sqrt{n}/\ln(n)\bigr)$. [Hint. You’ll
need to use the Prime Number Theorem (Theorem 13.1).] Do you think that this new
strategy is actually practical?

(c) Faster methods are known for solving this problem, such as the Quadratic Sieve and
the Elliptic Curve Method. The number of steps $L(n)$ that these methods require
satisfies

$$L(n) = O\Bigl(e^{c\sqrt{\ln(n)\,\ln\ln(n)}}\Bigr),$$

where $c$ is a small constant. Prove that this is faster than the method in (a) by showing
that

$$\lim_{n\to\infty} \frac{e^{c\sqrt{\ln(n)\,\ln\ln(n)}}}{\sqrt{n}} = 0.$$

More generally, show that the limit is 0 even if the $\sqrt{n}$ in the denominator is replaced
by $n^{\epsilon}$ for some (small) $\epsilon > 0$.

(d) The fastest known method to solve this problem for large numbers $n$ is called the
Number Field Sieve (NFS). The number of steps $M(n)$ required by the NFS is

$$M(n) = O\Bigl(e^{c'\sqrt[3]{(\ln n)(\ln\ln n)^2}}\Bigr),$$

where again $c'$ is a small constant. Prove that for large values of $n$ the function $M(n)$
is much smaller than the big-Oh estimate for $L(n)$ in (c).


Big-Oh notation is so useful that mathematicians and computer scientists have devised 
similar notation to describe some other typical situations. In the next few exercises, we 
introduce some of these concepts and ask you to work out some examples. 


40.8. Small-oh Notation. Intuitively, the notation o(h(n)) indicates a quantity that is 
much smaller than h(n). The precise definition is that 


$$f(n) = g(n) + o(h(n)) \quad\text{means that}\quad \lim_{n\to\infty} \frac{f(n) - g(n)}{h(n)} = 0.$$

(a) Prove that $n^{10} = o(2^n)$.

(b) Prove that $2^n = o(n!)$.

(c) Prove that $n! = o(2^{n^2})$.

(d) What does the formula $f(n) = o(1)$ mean? Which of the following functions
are $o(1)$?


@) f (n) = = Gi) f(n)=— (iii) f(n) =2"-"" 


40.9. Big-Omega Notation. Big-Omega notation is very similar to big-Oh notation,
except that the inequality is reversed.⁷ In other words,


$$f(n) = g(n) + \Omega(h(n))$$


means that there is a positive constant $C$ and a starting value $n_0$ such that


sin(n) 


$$|f(n) - g(n)| \ge C|h(n)| \quad\text{for all } n \ge n_0.$$


Frequently $g$ is zero, in which case $f(n) = \Omega(h(n))$ means that $|f(n)| \ge C|h(n)|$ for all
sufficiently large values of $n$.
(a) Prove that each of the following formulas is true.


has 
Qn 


(b) If $f(n) = \Omega(h(n))$ and $h(n) = \Omega(k(n))$, prove that $f(n) = \Omega(k(n))$.

(c) If $f(n) = \Omega(h(n))$, is it then always true that $h(n) = O(f(n))$?

(d) Let $f(n) = n^3 - 3n^2 + 7$. For what values of $d$ is it true that $f(n) = \Omega(n^d)$?

(e) For what values of $d$ is it true that $\sqrt{n} = \Omega((\log_2 n)^d)$?

(f) Prove that the function $f(n) = n \cdot \sin(n)$ does not satisfy $f(n) = \Omega(\sqrt{n})$. [Hint.
Use Dirichlet’s Diophantine Approximation Theorem (Theorem 33.2) to find
fractions $p/q$ satisfying $|p - 2\pi q| < 1/q$, let $n = p$, and use the fact that $\sin(x) \approx x$
when $x$ is small.]


Gi) n?-—n=Q(n) Gi) n! =(2") (iii) 


= 2(2") 


40.10. Big-Theta Notation. Big-Theta notation combines both big-Oh and big-Omega. 
One way to define big-Theta is to use the earlier definitions and say that 


$$f(n) = g(n) + \Theta(h(n))$$
if both


$$f(n) = g(n) + O(h(n)) \quad\text{and}\quad f(n) = g(n) + \Omega(h(n)).$$


⁷ Warning: Exercise 40.9 describes what $\Omega$ means to computer scientists. Mathematicians
typically assign a different meaning to $\Omega$. They take it to mean that there is a positive constant $C$ and
infinitely many values of $n$ such that $|f(n) - g(n)| \ge C|h(n)|$. Notice the important distinction
between a statement being true for all (large) values of $n$ and merely being true for infinitely many
values of $n$.

Or we can write everything out explicitly and define
$$f(n) = g(n) + \Theta(h(n))$$
to mean that there are positive constants $C_1$ and $C_2$ and a starting value $n_0$ such that


$$C_1|h(n)| \le |f(n) - g(n)| \le C_2|h(n)| \quad\text{for all } n \ge n_0.$$


(a) Prove that

$$\ln\Bigl(1 + \frac{1}{n}\Bigr) = \Theta\Bigl(\frac{1}{n}\Bigr).$$

[Hint. Use the Taylor series expansion of $\ln(1 + t)$ to estimate its value when $t$ is
small.]

(b) Use (a) to prove that

$$\ln|n^3 - n^2 + 3| = 3\ln(n) + \Theta\Bigl(\frac{1}{n}\Bigr).$$

(c) Generalize (b) and prove that if $f(x)$ is a polynomial of degree $d$ then

$$\ln|f(n)| = d\ln(n) + \Theta\Bigl(\frac{1}{n}\Bigr).$$


(d) If $f_1(n) = g_1(n) + O(h(n))$ and $f_2(n) = g_2(n) + O(h(n))$, prove that
$$f_1(n) + f_2(n) = g_1(n) + g_2(n) + O(h(n)).$$
(e) If $f(n) = O(h(n))$, is it then necessarily true that $h(n) = O(f(n))$?


Chapter 41 


Cubic Curves and Elliptic 
Curves 


We have now studied solutions to several different sorts of polynomial equations, 
including 


$x^2 + y^2 = z^2$     Pythagorean Triples Equation (Chapters 2 and 3)
$x^4 + y^4 = z^4$     Fermat’s Equation of Degree 4 (Chapter 30)
$x^2 - Dy^2 = 1$      Pell’s Equation (Chapters 32, 34, and 48)


These are all examples of what are known as Diophantine Equations. A Diophan- 
tine equation is a polynomial equation in one or more variables for which we are to 
find solutions in either integers or rational numbers. For example, in Chapter 2 we 
showed that every solution in (relatively prime) integers to the Pythagorean triples 
equation is given by the formulas 

$$x = st, \qquad y = \frac{s^2 - t^2}{2}, \qquad z = \frac{s^2 + t^2}{2}.$$


We reached a very different conclusion in Chapter 30 concerning Fermat’s equation
of degree 4, where we showed that there were no solutions in integers with $xyz \ne 0$.
Pell’s equation, on the other hand, has infinitely many solutions in integers, and
we showed in Chapter 34 that every solution can be obtained by taking a single 
basic solution and raising it to powers. 

In the next few chapters we discuss a new kind of Diophantine equation, one 
given by a polynomial of degree 3. We are especially interested in the rational 
number solutions, but we also discuss solutions in integers and solutions “modulo $p$.”
Diophantine equations of degree 2 are fairly well understood by mathematicians
today, but equations of degree 3 already pose enough difficulties to be
topics of current research. Also, surprisingly, it is by using equations of degree 3
that Andrew Wiles proved that Fermat’s equation $x^n + y^n = z^n$ has no solutions
in integers with $xyz \ne 0$ for all degrees $n \ge 3$.

The degree 3 equations that we will study are called elliptic curves.¹ Elliptic
curves are given by equations of the form


$$y^2 = x^3 + ax^2 + bx + c.$$
The numbers $a$, $b$, and $c$ are fixed, and we are looking for pairs of numbers $(x, y)$
that solve the equation. Here are three sample elliptic curves:


$$E_1 : y^2 = x^3 + 17,$$
$$E_2 : y^2 = x^3 + x,$$
$$E_3 : y^2 = x^3 - 4x^2 + 16.$$


The graphs of $E_1$, $E_2$, and $E_3$ are shown in Figure 41.1. We will return to these
three examples many times in the ensuing chapters to illustrate the general theory. 

As already mentioned, we will be studying solutions in rational numbers, in 
integers, and modulo p. Each of our three examples has solutions in integers, for 
example 


$E_1$ has the solutions $(-2, 3)$, $(-1, 4)$, and $(2, 5)$,
$E_2$ has the solution $(0, 0)$,
$E_3$ has the solutions $(0, 4)$ and $(4, 4)$.


We found these solutions by trial and error. In other words, we plugged in small
values for $x$ and checked to see if $x^3 + ax^2 + bx + c$ turned out to be a perfect
square. Similarly, checking a few small rational values for $x$, we discover the
rational solution $(1/4, 33/8)$ to $E_1$. How might we go about creating more solutions?

A principal theme of this chapter is the interplay between geometry and number 
theory. We’ve already seen this idea at work in Chapter 3, where we used the 
geometry of lines and circles to find Pythagorean triples. Briefly, in Chapter 3 we 
took a line through the point (—1, 0) on the unit circle and looked at the other point 
where the line intersected the circle. By taking lines whose slope was a rational 
number, we found that the second intersection point had $x,y$-coordinates that were


¹ Contrary to popular opinion, an elliptic curve is not an ellipse. You may recall that an ellipse
looks like a squashed circle. This is not at all the shape of the elliptic curves illustrated in Figure 41.1.
Elliptic curves first arose when mathematicians tried to compute the circumference of an ellipse,
whence their somewhat unfortunate moniker. A more accurate, but less euphonious, name for elliptic
curves is abelian varieties of dimension one.

$E_1 : y^2 = x^3 + 17$     $E_2 : y^2 = x^3 + x$     $E_3 : y^2 = x^3 - 4x^2 + 16$


Figure 41.1: Graphs of Three Representative Elliptic Curves 


rational numbers. In this way we used lines through the one point (—1, 0) to create 
lots of new points with rational coordinates. We want to use the same sort of 
method to find lots of points with rational coordinates on elliptic curves. 

Let’s try the exact same idea using the elliptic curve


$$E_1 : y^2 = x^3 + 17.$$


We draw lines through the point $P = (-2, 3)$ and see what other points we find.
For example, suppose we try the line with slope 1,


$$y - 3 = x + 2.$$


To find the intersection of this line with $E_1$, we substitute $y = x + 5$ into the
equation for $E_1$ and solve for $x$. Thus,


$$\begin{aligned}
y^2 &= x^3 + 17 && \text{The equation for } E_1. \\
(x + 5)^2 &= x^3 + 17 && \text{Substitute in the equation of the line.} \\
0 &= x^3 - x^2 - 10x - 8 && \text{Multiply out and combine terms.}
\end{aligned}$$


You probably don’t know how to find the roots of cubic polynomials,² but in this
instance we already know one of the solutions. The elliptic curve $E_1$ and the line
both go through the point $P = (-2, 3)$, so $x = -2$ must be a root. This allows us
to factor the cubic polynomial as


$$x^3 - x^2 - 10x - 8 = (x + 2)(x^2 - 3x - 4).$$


Now we can use the quadratic formula to find the roots $x = -1$ and $x = 4$ of
$x^2 - 3x - 4$. Substituting these values into the equation of the line $y = x + 5$ then
gives the $y$-coordinates of our new points $(-1, 4)$ and $(4, 9)$. You should check
that these points do indeed satisfy the equation $y^2 = x^3 + 17$.


² There actually is a cubic formula, although it is considerably more complicated than its cousin,
the quadratic formula. The first step in finding the roots of $x^3 + Ax^2 + Bx + C = 0$ is to make the
substitution $x = t - A/3$. After some work, the equation for $t$ looks like $t^3 + pt + q = 0$. A root of
this equation is then given by Cardano’s formula

$$t = \sqrt[3]{-q/2 + \sqrt{q^2/4 + p^3/27}} + \sqrt[3]{-q/2 - \sqrt{q^2/4 + p^3/27}}.$$

There is a yet more complicated quartic formula for the roots of fourth-degree polynomials, but that
is where the story ends. In the early 1800s, Niels Abel and Évariste Galois showed that there are
no similar formulas giving the roots of polynomials of degree 5 or greater. This result is one of
the great triumphs of modern mathematics, and the tools that were developed to prove it are still of
fundamental importance in algebra and number theory.

This looks good, but before we become overconfident, we should try (at least) 
one more example. Suppose we take the line through P = (—2, 3) having slope 3. 
This line has equation 


$$y - 3 = 3(x + 2),$$
which, after rearranging, becomes
$$y = 3x + 9.$$


We substitute $y = 3x + 9$ into the equation for $E_1$ and compute.


$$\begin{aligned}
y^2 &= x^3 + 17 && \text{The equation for } E_1. \\
(3x + 9)^2 &= x^3 + 17 && \text{Substitute } y = 3x + 9. \\
0 &= x^3 - 9x^2 - 54x - 64 && \text{Expand and combine terms.} \\
0 &= (x + 2)(x^2 - 11x - 32) && \text{Factor out the known root.}
\end{aligned}$$


Just as before, we can use the quadratic formula to find the roots of $x^2 - 11x - 32$,
but unfortunately what we find are the two values


$$x = \frac{11 \pm \sqrt{249}}{2}.$$


This is obviously not the sort of answer we were hoping for, since we are looking
for points on $E_1$ having rational coordinates.

What causes the problem? Suppose that we draw the line $L$ of slope $m$ through
the point $P = (-2, 3)$ and find its intersection with $E_1$. The line $L$ is given by the
equation

$$L : y - 3 = m(x + 2).$$


To find the intersection of $L$ and $E_1$, we substitute $y = m(x + 2) + 3$ into the
equation for $E_1$ and solve for $x$. When we do this, we get the following
formidable-looking cubic equation to solve:
$$\begin{aligned}
y^2 &= x^3 + 17 \\
(m(x + 2) + 3)^2 &= x^3 + 17 \\
0 &= x^3 - m^2x^2 - (4m^2 + 6m)x - (4m^2 + 12m - 8).
\end{aligned}$$
Of course, we do know one root is $x = -2$, so the equation factors as
$$0 = (x + 2)\bigl(x^2 - (m^2 + 2)x - (2m^2 + 6m - 4)\bigr).$$


Unfortunately for our plans, the other two roots are unlikely to be rational numbers.

It looks like the idea of using lines through known points to produce new points 
has hit a brick wall. As is so often the case in mathematics (and in life?), stepping 
back and taking a slightly wider view reveals a way to clamber over, squeeze under, 
or just plain walk around the wall. In this case, our problem is that we have a cubic 
polynomial, and we know that one of the roots is a rational number; but this leaves 
the other two roots as solutions of a quadratic polynomial whose roots may not 
be rational. How can we compel that quadratic polynomial to have rational roots? 
Harking back to our work in Chapter 3, we see that if a quadratic polynomial has 
one rational root then the other root will also be rational. In other words, we really 
want to force the original cubic polynomial to have two rational roots, and then the 
third one will be rational, too. 

This brings us to the crux of the problem. The original cubic polynomial had 
one rational root because we chose a line going through the point P = (—2, 3), 
thereby ensuring that $x = -2$ is a root. To force the cubic polynomial to have
two rational roots, we should choose a line that already goes through two rational
points on the elliptic curve $E_1$.

An example illustrates this idea. We start with the two points $P = (-2, 3)$ and
$Q = (2, 5)$ on the elliptic curve


$$E_1 : y^2 = x^3 + 17.$$
The line connecting $P$ and $Q$ has slope $(5 - 3)/(2 - (-2)) = 1/2$, so its equation
is
$$y = \tfrac{1}{2}x + 4.$$
Substituting this into the equation for $E_1$ gives


$$\begin{aligned}
y^2 &= x^3 + 17 \\
\bigl(\tfrac{1}{2}x + 4\bigr)^2 &= x^3 + 17 \\
0 &= x^3 - \tfrac{1}{4}x^2 - 4x + 1.
\end{aligned}$$
This must have $x = -2$ and $x = 2$ as roots, so it factors as


$$0 = (x - 2)(x + 2)\bigl(x - \tfrac{1}{4}\bigr).$$


Notice that the third root is indeed a rational number, $x = 1/4$, and substituting
this value into the equation of the line gives the corresponding $y$-coordinate,
$y = 33/8$. In summary, by taking the line through the two known solutions $(-2, 3)$ and
$(2, 5)$, we have found the rational solution $(1/4, 33/8)$ for our elliptic curve. This
procedure is illustrated in Figure 41.2.
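The chord construction can be sketched in a few lines of Python using exact rational arithmetic. This is a minimal illustration rather than a definitive implementation: the function name `third_point` and its argument layout are assumptions of this sketch. It relies on the fact that, after substituting the line $y = mx + k$ into $y^2 = x^3 + ax^2 + bx + c$, the three roots of the resulting cubic add up to $m^2 - a$.

```python
from fractions import Fraction


def third_point(p, q, a=0, b=0, c=17):
    """Third intersection of the line through p and q with
    y^2 = x^3 + a*x^2 + b*x + c (defaults give E_1: y^2 = x^3 + 17).

    b and c are listed only to name the curve; the formula uses a alone.
    Assumes p and q lie on the curve and have different x-coordinates.
    """
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line or repeated point: no third point here")
    m = Fraction(y2 - y1) / Fraction(x2 - x1)  # slope of the chord
    x3 = m * m - a - x1 - x2                   # the three roots sum to m^2 - a
    y3 = y1 + m * (x3 - x1)                    # y-coordinate read off the line
    return x3, y3


print(third_point((-2, 3), (2, 5)))  # (Fraction(1, 4), Fraction(33, 8))
```

Because the coordinates are stored as `Fraction` values, no rounding occurs, so the output can be compared directly with the point $(1/4, 33/8)$ found above.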




Figure 41.2: Using Two Known Points to Find a New Point 


Suppose that we try to repeat this procedure with the new solution $(1/4, 33/8)$.
If we draw the line through $(-2, 3)$ and $(1/4, 33/8)$, say, we know that the third
intersection point with $E_1$ is the point $(2, 5)$. So we end up back where we started.
Again it seems we are stuck, but once more a simple observation sets us moving.
This simple observation is that if $(x, y)$ is a point on the elliptic curve $E_1$ then the
point $(x, -y)$ is also a point on $E_1$. This is clear from the symmetry of $E_1$ about
the $x$-axis (see Figure 41.2). So what we do is take the new point $(1/4, 33/8)$,
replace it with $(1/4, -33/8)$, and then repeat the above procedure using the line
through $(1/4, -33/8)$ and $(-2, 3)$. This line has slope $-19/6$ and is given by the
equation $y = -19x/6 - 10/3$. Substituting into the equation for $E_1$, we end up
having to find the roots of

$$x^3 - \frac{361}{36}x^2 - \frac{190}{9}x + \frac{53}{9}.$$

Two of the roots are $1/4$ and $-2$, so we can divide this cubic polynomial by
$(x - 1/4)(x + 2)$ to find the other root,


$$x^3 - \frac{361}{36}x^2 - \frac{190}{9}x + \frac{53}{9} = \Bigl(x - \frac{1}{4}\Bigr)(x + 2)\Bigl(x - \frac{106}{9}\Bigr).$$

This gives $x = 106/9$, and substituting this value of $x$ into the equation of the
line gives $y = -1097/27$. So we have found a new point $(106/9, -1097/27)$
satisfying the equation

$$E_1 : y^2 = x^3 + 17.$$


Continuing in this fashion, we find lots and lots of points. In fact, just as 
with Pell’s equation, we get infinitely many points with rational coordinates. For 
Pell’s equation, we showed that all solutions can be obtained by taking powers 
of a single smallest solution. It turns out that every point on $E_1$ with rational
coordinates can be found by starting with the two points $P$ and $Q$, connecting
them by a line to find a new point, reflecting about the $x$-axis, drawing more lines
through the known points to find new points, reflecting again, and repeating the
process over and over. The important point to observe here is that every point on
$E_1$ with rational coordinates can be obtained by starting with just two points and
repeatedly applying a simple geometric procedure, just as every solution to Pell’s
equation was obtained by starting with one basic solution and repeatedly applying
a simple rule. The fact that the infinitely many rational solutions to $E_1$ can be
created from a finite generating set is a special case of a famous theorem.


Theorem 41.1 (Mordell’s Theorem). (L.J. Mordell, 1922) Let $E$ be an elliptic
curve given by the equation


$$E : y^2 = x^3 + ax^2 + bx + c,$$
where $a, b, c$ are integers such that the discriminant
$$\Delta(E) = -4a^3c + a^2b^2 - 4b^3 - 27c^2 + 18abc$$
is not zero.³ Then there is a finite list of solutions
$$P_1 = (x_1, y_1),\quad P_2 = (x_2, y_2),\quad \ldots,\quad P_r = (x_r, y_r),$$


with rational coordinates such that every rational solution to E can be obtained 
starting from these r points and repeatedly taking lines through pairs of points, 
intersecting with E, and reflecting to create new points. 
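The discriminant condition is easy to check by machine. The sketch below simply evaluates the formula for $\Delta(E)$ from the theorem on the three sample curves $E_1$, $E_2$, $E_3$; the function name `discriminant` is an assumption made for this illustration.

```python
def discriminant(a, b, c):
    """Discriminant of y^2 = x^3 + a*x^2 + b*x + c, as given in Theorem 41.1."""
    return -4 * a**3 * c + a**2 * b**2 - 4 * b**3 - 27 * c**2 + 18 * a * b * c


print(discriminant(0, 0, 17))   # E_1: y^2 = x^3 + 17          -> -7803
print(discriminant(0, 1, 0))    # E_2: y^2 = x^3 + x           -> -4
print(discriminant(-4, 0, 16))  # E_3: y^2 = x^3 - 4x^2 + 16   -> -2816
```

All three values are nonzero, so each of the three sample curves satisfies the hypothesis of the theorem.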


Mordell proved his theorem in 1922. Unfortunately, the proof is too compli- 
cated for us to give in detail, but the following outline of Mordell’s proof shows 
that it is nothing more than a fancy version of Fermat’s descent method: 


³ If $\Delta(E) = 0$, then the cubic polynomial $x^3 + ax^2 + bx + c$ has a double or triple root, and the
curve $E$ either crosses itself or has a sharp point. (See Exercise 41.7.) The discriminant $\Delta(E)$ will
appear in several guises as we continue our study of elliptic curves.

(1) The first step is to make a list $P_1, P_2, \ldots, P_r$ of “small” points on $E$ having
rational coordinates.


(2) The next step is to show that if $Q$ is any point with rational coordinates that
is not in the list, then it is possible to choose one of the $P_i$’s so that the line
through $P_i$ and $Q$ intersects $E$ in a third point $Q'$ that is “smaller” than $Q$.


(3) Repeating this process, we get a list of points $Q, Q', Q'', Q''', \ldots$ of decreasing
size, and we show that eventually the size gets so small that we end up
with one of the $P_i$’s in our original list.


Notice the similarity to our work on Pell’s equation, where we showed that any 
large solution is always the product of a smaller solution and the smallest solution. 
Of course, it’s not even clear what “larger” and “smaller” mean for points with
rational coordinates on an elliptic curve. This is one of the many ideas that
Mordell had to work out before his proof was complete.

Let’s take a look at some of the rational solutions to $E_1$. We start with
$P_1 = (-2, 3)$ and $P_2 = (-1, 4)$. The line through $P_1$ and $P_2$ intersects $E_1$ in a third
point, which we reflect about the $x$-axis and call $P_3$. Next we take the line through
the points $P_1$ and $P_3$, intersect it with $E_1$, and reflect across the $x$-axis to get $P_4$.
Using the line through $P_1$ and $P_4$, we similarly get $P_5$, and so on. The following are
the first few $P_n$’s. As you can see, the numbers get complicated with frightening
rapidity.


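$$P_1 = (-2, 3), \quad P_2 = (-1, 4), \quad P_3 = (4, -9), \quad P_4 = (2, 5),$$

$$P_5 = \Bigl(\frac{1}{4}, -\frac{33}{8}\Bigr), \quad
  P_6 = \Bigl(\frac{106}{9}, \frac{1097}{27}\Bigr), \quad
  P_7 = \Bigl(-\frac{2228}{961}, -\frac{63465}{29791}\Bigr),$$

$$P_8 = \Bigl(\frac{76271}{289}, -\frac{21063928}{4913}\Bigr), \quad
  P_9 = \Bigl(-\frac{9776276}{6145441}, \frac{54874234809}{15234548239}\Bigr),$$

$$P_{10} = \Bigl(\frac{3497742218}{607770409}, -\frac{215890250625095}{14983363893077}\Bigr).$$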


We would like a quantitative way to measure the “size” of these points. One way to
do this is to look at the numerator and denominator of the $x$-coordinates. In other
words, if we write the coordinates of $P_n$ as

$$P_n = \Bigl(\frac{A_n}{B_n}, \frac{C_n}{D_n}\Bigr)$$

in lowest terms, we might define the size of $P_n$ to be⁴

$$\text{size}(P_n) = \text{maximum of } |A_n| \text{ and } |B_n|.$$

For example,

$$\text{size}(P_1) = \max\{|-2|, |1|\} = 2$$
and

$$\text{size}(P_7) = \max\{|-2228|, |961|\} = 2228.$$
The first 20 $P_n$’s together with their sizes are listed in Table 41.1.

   n   size(P_n)
   1   2
   2   1
   3   4
   4   2
   5   4
   6   106
   7   2228
   8   76271
   9   9776276
  10   3497742218
  11   1160536538401
  12   1610419388060961
  13   43923749623043363812
  14   102656671584861356692801
  15   18531846858359807878734515284
  16   370183335711420357564604634095918
  17   125067940343620957546805016634617881761
  18   14803896396546295880463242120819717253248409
  19   41495337621274074603425488675302807756680196997372
  20   83094719816361303226380666143399722139698613105279866991

Table 41.1: The Size of Points $P_n$ on $E_1$
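The first rows of Table 41.1 can be reproduced with a short program. The following is a sketch under stated assumptions: the names `add_points`, `size`, and `first_points` are invented for this illustration, the curve is fixed to $E_1 : y^2 = x^3 + 17$, and the chord formula assumes the two points have different $x$-coordinates (true for the points computed here).

```python
from fractions import Fraction


def add_points(p, q):
    """Line through p and q on E_1: y^2 = x^3 + 17, third intersection, reflected."""
    (x1, y1), (x2, y2) = p, q
    m = (y2 - y1) / (x2 - x1)  # slope; requires x1 != x2
    x3 = m * m - x1 - x2       # the three roots of the cubic sum to m^2
    y3 = y1 + m * (x3 - x1)
    return x3, -y3             # reflect about the x-axis


def size(p):
    """max(|A|, |B|) where the x-coordinate of p is A/B in lowest terms."""
    x = p[0]
    return max(abs(x.numerator), abs(x.denominator))


def first_points(count):
    """P_1, P_2, then P_n built from the line through P_1 and P_(n-1)."""
    p1 = (Fraction(-2), Fraction(3))
    points = [p1, (Fraction(-1), Fraction(4))]
    while len(points) < count:
        points.append(add_points(p1, points[-1]))
    return points[:count]


for n, p in enumerate(first_points(8), start=1):
    print(n, size(p))
# The last line printed is: 8 76271
```

`Fraction` keeps every coordinate in lowest terms automatically, which is what the definition of size requires.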

Do you see any sort of pattern in Table 41.1? The actual numbers don’t seem 

to follow a pattern, but try moving back a little bit and squinting while staring at 


the table. Imagine that the numbers are solid black boxes and look at the curve that 
separates the black area from the white area. Does it look familiar? If not, look 


at Table 41.2, which extends Table 41.1 up to $n \le 50$ with the digits replaced by
black boxes.

Table 41.2: The Size of Points $P_n$ on $E_1$

⁴ The mathematical term for what we are calling the size is the height.

The curve separating the black region from the white region looks very much
like a parabola (lying on its side). What this means is that the number of digits in
$\text{size}(P_n)$ looks like $cn^2$ for some constant $c$. Using more advanced methods, it is
possible to show that $c$ is approximately 0.1974.⁵ In other words, for large values
of $n$, the size of $P_n$ looks like


$$\#\text{ of digits in size}(P_n) \approx 0.1974n^2,$$
$$\text{size}(P_n) \approx 10^{0.1974n^2} \approx (1.574)^{n^2}.$$
It is instructive to compare this with the solutions to Pell’s equation that we found
in Chapter 32. We showed there that the size of the $n$th solution $(x_n, y_n)$ to Pell’s
equation $x^2 - 2y^2 = 1$ is approximately


$$x_n \approx \tfrac{1}{2}(5.82843)^n.$$
The exponential growth rate for Pell’s equation is quite rapid, but it pales in
comparison to the speed with which the points on an elliptic curve grow.
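To see the Pell growth rate concretely, the sketch below generates solutions of $x^2 - 2y^2 = 1$ by repeatedly multiplying by $3 + 2\sqrt{2}$ and compares $x_n$ with the estimate $\tfrac{1}{2}(3 + 2\sqrt{2})^n \approx \tfrac{1}{2}(5.82843)^n$. The function name `pell_solutions` is an assumption of this illustration, and the last column simply evaluates the formula $0.1974n^2$ quoted above for the elliptic curve.

```python
import math


def pell_solutions(count):
    """First `count` positive solutions (x_n, y_n) of x^2 - 2*y^2 = 1."""
    x, y = 3, 2  # the smallest solution
    solutions = []
    for _ in range(count):
        solutions.append((x, y))
        # (x + y*sqrt(2)) * (3 + 2*sqrt(2)) = (3x + 4y) + (2x + 3y)*sqrt(2)
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return solutions


for n, (x, y) in enumerate(pell_solutions(5), start=1):
    estimate = 0.5 * (3 + 2 * math.sqrt(2)) ** n
    # digits of x_n, next to the quadratic digit estimate for size(P_n)
    print(n, x, round(estimate, 3), len(str(x)), round(0.1974 * n * n, 1))
```

The number of digits of $x_n$ grows linearly in $n$, while the estimate $0.1974n^2$ for the elliptic curve grows quadratically.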
⁵ The value of $c$ is computed with the theory of canonical heights developed by André Néron and
John Tate in the 1960s. Using this theory, we can show that the ratio $\ln(\text{size}(P_n))/n^2$ gets closer
and closer to 0.4546168651... as $n$ gets larger and larger.

Exercises


41.1. For each of the following pairs of points on the elliptic curve $E_1 : y^2 = x^3 + 17$, use
the line connecting the points to find a new point with rational coordinates on $E_1$.

(a) The points (—1, 4) and (2, 5) 

(b) The points (43, 282) and (52, —375) 

(c) The points (—2, 3) and (19/25, 522/125) 


41.2. The elliptic curve
$$E : y^2 = x^3 + x - 1$$


has the points $P = (1, 1)$ and $Q = (2, -3)$ with rational coordinates.

(a) Use the line connecting $P$ and $Q$ to find a new point $R$ on $E$ having rational
coordinates.

(b) Let $R'$ be the point obtained by reflecting $R$ through the $x$-axis. [That is, if
$R = (x, y)$, then $R' = (x, -y)$.] Use the line through $P$ and $R'$ to find a new point $S$ with
rational coordinates on $E$.

(c) Same as (b), but use the line through $Q$ and $R'$ to find a new point $T$.

(d) Let $S$ be the point you found in (b), and let $S'$ be the point obtained by reflecting $S$
through the $x$-axis. What point do you get if you use the line through $P$ and $S'$ to
find a new point on $E$?


41.3. Suppose that $Q_1, Q_2, Q_3, \ldots$ is a list of points with rational coordinates on an elliptic
curve $E$, and suppose that their sizes are strictly decreasing,


$$\text{size}(Q_1) > \text{size}(Q_2) > \text{size}(Q_3) > \text{size}(Q_4) > \cdots$$


Explain why the list must stop after a finite number of points. In other words, explain why 
a list of points with strictly decreasing sizes must be a finite list. Do you see why this 
makes the size a good tool for proofs by descent? 


41.4. Write a short biography of Girolamo Cardano, including especially a description of 
his publication of the solution to the cubic equation and the ensuing controversy. 


41.5. (This exercise is for people who have taken some calculus.) There is another way 
to find points with rational coordinates on elliptic curves that involves using tangent lines. 
This exercise explains the method for the curve 


$$E : y^2 = x^3 - 3x + 7.$$


(a) The point $P = (2, 3)$ is a point on $E$. Find an equation for the tangent line $L$ to the
elliptic curve $E$ at the point $P$. [Hint. Use implicit differentiation to find the slope
$dy/dx$ at $P$.]

(b) Find where the tangent line $L$ intersects the elliptic curve $E$ by substituting the
equation for $L$ into $E$ and solving. You should discover a new point $Q$ with rational
coordinates on $E$. (Notice that $x = 2$ is a double root of the cubic equation you need
to solve. This reflects the fact that $L$ is tangent to $E$ at the point where $x = 2$.)

(c) Let $R$ be the point you get by reflecting $Q$ across the $x$-axis. [In other words, if
$Q = (x_1, y_1)$, let $R = (x_1, -y_1)$.] Take the line through $P$ and $R$ and intersect it
with $E$ to find a third point with rational coordinates on $E$.


41.6. Let $L$ be the line $y = m(x + 2) + 3$ of slope $m$ going through the point $(-2, 3)$. This
line intersects the elliptic curve $E_1 : y^2 = x^3 + 17$ in the point $(-2, 3)$ and in two other
points. If all three of these points have rational coordinates, show that the quantity


$$m^4 + 12m^2 + 24m - 12$$


must be the square of a rational number. Substitute in values of $m$ between $-10$ and 10 to
find which ones make this quantity a square, and use the values you find to obtain rational
solutions to $y^2 = x^3 + 17$.


41.7. The discriminant of each of the curves


$$C_1 : y^2 = x^3 \quad\text{and}\quad C_2 : y^2 = x^3 + x^2$$


is zero. Graph these two curves and explain in what way your graphs are different from 
each other and different from the graphs of the elliptic curves illustrated in Figure 41.1. 


41.8. Let $a, b, c$ be integers, let $E$ be the elliptic curve
$$E : y^2 = x^3 + ax^2 + bx + c,$$


and let $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ be points on $E$ with coordinates that are rational
numbers.

(a) Let $L$ be the line connecting $P_1$ and $P_2$. Write a program to compute the third point
$P_3 = (x_3, y_3)$ where the line $L$ intersects $E$. (If $L$ is a vertical line, then there
won’t be a real third intersection point, so your program should return a warning
message.) You should keep track of the coordinates as rational numbers; if your
computer language won’t let you work with rational numbers directly, you’ll have
to store a rational number $A/B$ as a pair $(A, B)$, in which case you should always
cancel $\gcd(A, B)$.


(b) Modify your program so that the output is the reflected point $(x_3, -y_3)$. We denote
this point with the suggestive notation $P_1 \oplus P_2$, since it is a sort of “addition” rule for
the points of $E$.

(c) Let $E$ be the elliptic curve


$$E : y^2 = x^3 + 3x^2 - 7x + 3,$$
and consider the points $P = (2, -3)$, $Q = (37/36, 53/216)$, and $R = (3, 6)$. Use
your program to compute
$$P \oplus Q, \quad Q \oplus R, \quad\text{and}\quad P \oplus R.$$
Next compute
$$(P \oplus Q) \oplus R \quad\text{and}\quad P \oplus (Q \oplus R).$$


Are the answers the same regardless of the order in which you “add” the points? Do
you find this surprising? (If not, try proving that the corresponding fact is true for
every elliptic curve.)

41.9. For this exercise you will need to use a computer package that handles integers
with arbitrarily many digits. Let $E$ be the elliptic curve


$$E : y^2 = x^3 + 8.$$


Starting with the two points $P_1 = (1, 3)$ and $P_2 = (-7/4, -13/8)$ on $E$, take the line
through $P_1$ and $P_2$, find the third point where it intersects $E$, reflect about the $x$-axis and
call the new point $P_3$. Repeat the process with $P_1$ and $P_3$ to get $P_4$, then with $P_1$ and $P_4$ to
get $P_5$, and so on up to, say, $P_{20}$. The coordinates of the points will get very complicated,
so have your computer program output a table with the following data (I’ve provided the
first four entries for you):

   n   ln size(P_n)   ln size(P_n)/n^2
   1    0.0000         0.0000
   2    1.9459         0.4865
   3    6.0707         0.6745
   4   10.3441         0.6465

Does the quantity $\ln \text{size}(P_n)/n^2$ appear to approach a finite nonzero limit? Does the list
of points on $E$ grow slower or faster than the list of points (Table 41.1) that we found on
the curve $E_1 : y^2 = x^3 + 17$?


Chapter 42 


Elliptic Curves with Few 
Rational Points 


The elliptic curve $E_1$ with equation $y^2 = x^3 + 17$ has lots of points with rational
coordinates. On the other hand, the elliptic curve $E_2$ with equation $y^2 = x^3 + x$
appears to have very few such points. In fact, the only point that is immediately
visible is the point $(0, 0)$. We show that this is indeed the only rational point on $E_2$.


Theorem 42.1. The only point with rational coordinates on the elliptic curve
$$E_2 : y^2 = x^3 + x$$
is the point $(x, y) = (0, 0)$.
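Before the proof, a quick numerical experiment makes the contrast between the two curves visible. The sketch below is an illustration only, not part of the argument: the names `is_rational_square` and `small_rational_x` and the search bound are assumptions of this example. It tries every fraction $x = A/B$ in lowest terms with $|A| \le$ bound and $1 \le B \le$ bound and keeps those for which $x^3 + ax^2 + bx + c$ is the square of a rational number.

```python
from fractions import Fraction
from math import gcd, isqrt


def is_rational_square(value):
    """True if the Fraction `value` is the square of a rational number."""
    if value < 0:
        return False
    n, d = value.numerator, value.denominator  # already in lowest terms
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d


def small_rational_x(a, b, c, bound):
    """x-coordinates A/B with |A|, B <= bound of rational points on
    y^2 = x^3 + a*x^2 + b*x + c."""
    found = set()
    for den in range(1, bound + 1):
        for num in range(-bound, bound + 1):
            if gcd(num, den) != 1:
                continue  # skip fractions that are not in lowest terms
            x = Fraction(num, den)
            if is_rational_square(x**3 + a * x**2 + b * x + c):
                found.add(x)
    return sorted(found)


print(small_rational_x(0, 1, 0, 10))   # E_2: y^2 = x^3 + x
print(small_rational_x(0, 0, 17, 10))  # E_1: y^2 = x^3 + 17
```

For $E_2$ the search turns up only $x = 0$, while for $E_1$ it finds several values, including $x = -2, -1, 1/4, 2, 4$. A finite search of this kind proves nothing, of course; that is the job of the argument that follows.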


Proof. Suppose that $(A/B, C/D)$ is a point on $E_2$ with rational coordinates, where
we write the fractions $A/B$ and $C/D$ in lowest terms. In particular, we take the
denominators $B$ and $D$ to be positive. Our task is to show that $A = 0$ and $C = 0$.
Substituting $x = A/B$ and $y = C/D$ into the equation for $E_2$ and clearing
denominators, we get the equation


$$C^2B^3 = A^3D^2 + AB^2D^2. \qquad (*)$$


Any solution in integers to this equation (with $B$ and $D$ not zero) gives a rational
point on $E_2$.

The equation (*) contains a lot of divisibility information from which we draw
numerous conclusions. For example, factoring the right-hand side of (*) gives


$$C^2B^3 = D^2A(A^2 + B^2),$$
so $D^2A$ divides $C^2B^3$. However, we know that $\gcd(C, D) = 1$, so $D^2$ must divide
$B^3$. Similarly, rearranging (*) and factoring gives


$$A^3D^2 = C^2B^3 - AB^2D^2 = B^2(C^2B - AD^2),$$


so $B^2$ divides $A^3D^2$. Since $\gcd(A, B) = 1$, it follows that $B^2$ must divide $D^2$,
which of course means that $B$ divides $D$. We have verified that


$$D^2 \mid B^3 \quad\text{and}\quad B \mid D.$$


Let v = D/B, so we know that v is an integer. Substituting D = Bv into the
relation D^2 | B^3 tells us that B^2 v^2 | B^3, so v^2 | B. In other words, we can write B as
B = v^2 z for some integer z. Notice D = Bv = v^3 z. Substituting B = v^2 z and
D = v^3 z into the equation (*) yields

C^2 B^3 = A^3 D^2 + A B^2 D^2
C^2 (v^2 z)^3 = A^3 (v^3 z)^2 + A (v^2 z)^2 (v^3 z)^2
C^2 z = A^3 + A v^4 z^2,

and rearranging gives

A^3 = C^2 z − A v^4 z^2 = z (C^2 − A v^4 z).


Thus, z divides A^3. However, z also divides B and gcd(A, B) = 1, so we must
have z = ±1. On the other hand, B = v^2 z and we know that B is positive, so in
fact z = 1. We now know that

B = v^2 and D = v^3,

so our original point (A/B, C/D) on E_2 looks like (A/v^2, C/v^3), and the
equation (*) becomes

C^2 = A^3 + A v^4.

Factoring the right-hand side, we see that

C^2 = A(A^2 + v^4).


This is a very interesting equation, because it expresses the perfect square C^2
as the product of the two numbers A and A^2 + v^4. I claim that these two numbers
have no common factors. Do you see why? Well, if A and A^2 + v^4 were to have
a common factor, say they were both divisible by some prime p, then v would also
have to be divisible by p. However, A and v can't both be divisible by p, since the
fraction A/B = A/v^2 is written in lowest terms.

So we now know that A and A^2 + v^4 have no common factors, their product is
a perfect square, and it is clear from the formula C^2 = A(A^2 + v^4) that they are
both positive. The only way that this can happen is if each of them individually is
a square. (Does this reasoning look familiar? We used it long ago in Chapter 2 to
derive a formula for Pythagorean triples.) In other words, we can find integers u
and w such that

A = u^2 and A^2 + v^4 = w^2.

Substituting the value A = u^2 into the second equation gives

u^4 + v^4 = w^2.

Let's recapitulate our progress. We began with some solution to the elliptic
curve E_2, which we wrote as (A/B, C/D) in lowest terms. Starting from this
solution, we showed that there must be integers u, v, and w satisfying the equation

u^4 + v^4 = w^2.

Furthermore, given such integers u, v, w, we can recover the solution to E_2 from
the formulas A/B = u^2/v^2 and C/D = uw/v^3. Do you recognize this u, v, w
equation? It should look familiar, since it is exactly the equation that we studied in
Chapter 30, where we showed that the only solutions are those with either u = 0
or v = 0. Since u = 0 leads to (A/B, C/D) = (0, 0) and v = 0 leads to zeros in
the denominator, it follows that the only point with rational coordinates on E_2 is
the point (0, 0). This completes our proof. □
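As a numerical sanity check on Theorem 42.1 (it is no substitute for the proof), here is a minimal Python sketch. It relies on the fact established in the proof that a rational point on E_2 must look like (A/v^2, C/v^3) with C^2 = A^3 + A v^4. The function name rational_points_E2 and the search bound are illustrative choices, not anything from the text.

```python
from fractions import Fraction
from math import isqrt

def rational_points_E2(bound):
    """Search y^2 = x^3 + x for points x = A/v^2, y = C/v^3 with |A| <= bound, 1 <= v <= bound."""
    points = set()
    for v in range(1, bound + 1):
        for A in range(-bound, bound + 1):
            rhs = A**3 + A * v**4            # C^2 must equal A^3 + A*v^4
            if rhs < 0:
                continue                     # a negative number is never a square
            C = isqrt(rhs)
            if C * C == rhs:
                points.add((Fraction(A, v * v), Fraction(C, v**3)))
                points.add((Fraction(A, v * v), Fraction(-C, v**3)))
    return points

print(rational_points_E2(60))
```

Within the searched range the only point that turns up is (0, 0), in agreement with the theorem.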


We now turn to our third representative elliptic curve

E_3 : y^2 = x^3 − 4x^2 + 16.

A brief search reveals four points on E_3,

P_1 = (0, 4), P_2 = (4, 4), P_3 = (0, −4), P_4 = (4, −4).


What happens if we use these four points and play the same game that we played
on E_1? The line connecting P_1 and P_2 has equation y = 4. To find where this line
intersects E_3, we substitute y = 4 into the equation of E_3 and solve for x:

4^2 = x^3 − 4x^2 + 16,

0 = x^3 − 4x^2 = x^2 (x − 4).

Notice that x = 0 is a double root, so the line connecting P_1 and P_2 intersects E_3
only at P_1 and P_2. The game is lost; we've failed to find any new points. A similar
thing happens if we choose any two of P_1, P_2, P_3, P_4, connect them with a line,
and compute the intersection of the line with E_3. We never find any new points. In
fact, it turns out that the only points on E_3 with rational coordinates are the four
points P_1, P_2, P_3, P_4. (Unfortunately, the proof is too long for us to give.)

More generally, a finite collection of points

P_1, P_2, ..., P_t

(with t ≥ 3) on an elliptic curve

E : y^2 = x^3 + ax^2 + bx + c

is called a torsion collection if, whenever you draw a line L through two of the P_i's,
all intersection points of L and E are already in the collection. Another way to say
this is that a torsion collection cannot be enlarged using the geometric method of
taking lines and intersections. For example, E_3 has the torsion collection consisting
of the four points (0, ±4), (4, ±4). The following important theorem describes
torsion collections.
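The closure condition in this definition can be tested mechanically. The following Python sketch is one way to do it, under some simplifying assumptions: it only draws lines through pairs of distinct points, it treats a vertical line as producing no further point, and it does not check that the given points actually lie on the curve. The helper names third_point and is_torsion_collection are hypothetical. The third intersection uses the fact that substituting y = mx + k into the curve gives a cubic in x whose three roots sum to m^2 − a.

```python
from fractions import Fraction
from itertools import combinations

def third_point(P, Q, a):
    """Third intersection of the line through distinct points P and Q with
    y^2 = x^3 + a*x^2 + b*x + c. Returns None when the line is vertical."""
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        return None                       # vertical line: no further point in the plane
    m = Fraction(y2 - y1, x2 - x1)        # slope of the line
    x3 = m * m - a - x1 - x2              # the three x-coordinates sum to m^2 - a
    y3 = y1 + m * (x3 - x1)               # plug x3 into the equation of the line
    return (x3, y3)

def is_torsion_collection(points, a):
    """True if every line through two distinct listed points meets the curve
    only in points that are already listed."""
    listed = set(points)
    for P, Q in combinations(points, 2):
        R = third_point(P, Q, a)
        if R is not None and R not in listed:
            return False
    return True

# The four points on E_3 : y^2 = x^3 - 4x^2 + 16 (so a = -4).
print(is_torsion_collection([(0, 4), (4, 4), (0, -4), (4, -4)], -4))
# Two points on E_1 : y^2 = x^3 + 17 (so a = 0); the line through them hits (4, 9).
print(is_torsion_collection([(-2, 3), (-1, 4)], 0))
```

The first call reports that the four points on E_3 are closed under the line construction, while the second shows how the same construction on E_1 immediately produces a new point.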


Theorem 42.2 (Torsion Theorem). Let E : y^2 = x^3 + ax^2 + bx + c be an elliptic
curve with integer coefficients a, b, c, and let P_1, P_2, ..., P_t be a torsion collection
on E, consisting of points whose coordinates are rational numbers. Also let

Δ(E) = −4a^3 c + a^2 b^2 − 4b^3 − 27c^2 + 18abc

be the discriminant of E, and suppose that Δ(E) ≠ 0.

(a) (Nagell-Lutz Theorem, 1935/37) Write the coordinates of each P_i as P_i =
(x_i, y_i). Then all the x_i's and y_i's are integers. Furthermore, if y_i ≠ 0, then
y_i | 16Δ(E).

(b) (Mazur's Theorem, 1977) A torsion collection can contain at most 15 points.
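To see part (a) in action, the short Python sketch below evaluates the discriminant formula from the theorem and tests the divisibility condition y | 16Δ(E) for a list of integer points. The function names discriminant and passes_nagell_lutz are illustrative. Note that the condition is only necessary: a point that passes the test need not belong to a torsion collection.

```python
def discriminant(a, b, c):
    """Discriminant of y^2 = x^3 + a*x^2 + b*x + c, as defined in Theorem 42.2."""
    return -4 * a**3 * c + a**2 * b**2 - 4 * b**3 - 27 * c**2 + 18 * a * b * c

def passes_nagell_lutz(points, a, b, c):
    """Check the necessary condition: each y is 0 or divides 16 * discriminant."""
    bound = 16 * discriminant(a, b, c)
    return all(y == 0 or bound % y == 0 for (_, y) in points)

# E_3 : y^2 = x^3 - 4x^2 + 16 and its four known points.
print(discriminant(-4, 0, 16))
print(passes_nagell_lutz([(0, 4), (0, -4), (4, 4), (4, -4)], -4, 0, 16))
# On E_1 : y^2 = x^3 + 17 the point (2, 5) fails, so it cannot lie in a torsion collection.
print(passes_nagell_lutz([(2, 5)], 0, 0, 17))
```

For E_3 the discriminant works out to −2816 = −2^8 · 11, and y = ±4 certainly divides 16 times that number.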


The Nagell-Lutz portion of the Torsion Theorem says that points in a torsion
collection have integer coordinates. We've also seen examples of points with
integer coordinates that do not lie in a torsion collection, such as the point (−2, 3) on
the curve E_1 : y^2 = x^3 + 17. Our investigations have unearthed quite a few points
with integer coordinates on E_1, including

(−2, ±3), (−1, ±4), (2, ±5), (4, ±9),
(8, ±23), (43, ±282), (52, ±375).

We know that the curve E_1 has infinitely many points with rational coordinates,
so there's no reason why it shouldn't be possible to extend this list indefinitely.
Continuing our search, we soon find another integer point on E_1,

(5234, ±378661),

but after that we find no others, even if we search up to x < 10^100. Eventually, we
begin to suspect that there are no other integer points on E_1. This turns out to be
true; it is a special case of the following fundamental result.
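The search described above is easy to reproduce on a small scale. Here is a minimal Python sketch that tries every integer x in a range and tests whether x^3 + ax^2 + bx + c is a perfect square. The function name integer_points and the bound H = 6000 are illustrative choices; only the point with y ≥ 0 is reported for each x.

```python
from math import isqrt

def integer_points(a, b, c, H):
    """Integer points (x, y) with y >= 0 and |x| <= H on y^2 = x^3 + a*x^2 + b*x + c."""
    found = []
    for x in range(-H, H + 1):
        rhs = x**3 + a * x**2 + b * x + c
        if rhs < 0:
            continue                      # negative values cannot be squares
        y = isqrt(rhs)                    # exact integer square root
        if y * y == rhs:
            found.append((x, y))
    return found

# E_1 : y^2 = x^3 + 17
for point in integer_points(0, 0, 17, 6000):
    print(point)
```

With this bound the search recovers exactly the x-coordinates −2, −1, 2, 4, 8, 43, 52, and 5234 listed in the text.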


Theorem 42.3 (Siegel's Theorem). (C.L. Siegel, 1926) Let E be an elliptic curve

E : y^2 = x^3 + ax^2 + bx + c

given by an equation whose coefficients a, b, and c are integers and with
discriminant Δ(E) ≠ 0. Then there are only finitely many solutions in integers x and y.


Siegel actually gave two very different proofs of his theorem. The first,
published in 1926 in the Journal of the London Mathematical Society,¹ works directly
with the equation for E and uses factorization methods. The second proof,
published in 1929, begins with Mordell's theorem and uses the geometric method for
generating new points from old points. Ultimately, however, both proofs rely on
the theory of Diophantine approximation (Chapter 33), specifically on advanced
results which say that certain numbers cannot be closely approximated by rational
numbers.


Exercises 


42.1. A Pythagorean triple (a, b, c) describes a right triangle whose sides have lengths that 
are integers. We will call such a triangle a Pythagorean triangle. Find all Pythagorean 
triangles whose area is twice a perfect square. 


42.2. (a) Let E be the elliptic curve E : y^2 = x^3 + 1. Show that the points (−1, 0), (0, 1),
(0, −1), (2, 3), (2, −3) form a torsion collection on E.

(b) Let E be the elliptic curve E : y^2 = x^3 − 43x + 166. The four points (3, 8),
(3, −8), (−5, 16), and (−5, −16) form part of a torsion collection on E. Draw lines
through pairs of these points and intersect the lines with E to construct the full torsion
collection.

(c) Let E be an elliptic curve given by an equation
y^2 = (x − α)(x − β)(x − γ).
Verify that the set of points (α, 0), (β, 0), (γ, 0) is a torsion collection.


¹In both England and Germany in the 1920s, there was still a great deal of lingering bitterness
from World War I, so Siegel published his article using the pseudonym “X”.

42.3. How many integer solutions can you find on the elliptic curve
y^2 = x^3 − 16x + 16?

42.4. This exercise guides you in proving that the elliptic curve
E : y^2 = x^3 + 7
has no solutions in integers x and y. (This special case of Siegel's Theorem was originally
proven by V.A. Lebesgue in 1869.)
(a) Suppose that (x, y) is a solution in integers. Show that x must be odd.
(b) Show that y^2 + 1 = (x + 2)(x^2 − 2x + 4).
(c) Show that x^2 − 2x + 4 must be congruent to 3 modulo 4. Explain why x^2 − 2x + 4
must be divisible by some prime q satisfying q ≡ 3 (mod 4).
(d) Reduce the original equation y^2 = x^3 + 7 modulo q, and use the resulting congruence
to show that −1 is a quadratic residue modulo q. Explain why this is impossible,
thereby proving that y^2 = x^3 + 7 has no solutions in integers.


42.5. The elliptic curve E : y^2 = x^3 − 2x + 5 has the four integer points P = (−2, ±1)
and Q = (1, ±2).
(a) Find four more integer points by plugging in x = 2, 3, 4, ... and seeing if x^3 − 2x + 5
is a square.
(b) Use the line through P and Q to find a new point R having rational coordinates.
Reflect R across the x-axis to get a point R'. Now take the line through Q and R'
and intersect it with E to find a point with rather large integer coordinates.


42.6. (a) Show that the equation y^2 = x^3 + x^2 has infinitely many solutions in integers
x, y. [Hint. Try substituting y = tx.]
(b) Does your answer in (a) mean that Siegel's Theorem is incorrect? Explain.
(c) Show that the equation y^2 = x^3 − x^2 − x + 1 has infinitely many solutions in integers
x, y.


42.7. Let E : y^2 = x^3 + ax^2 + bx + c be an elliptic curve with a, b, and c integers.
Suppose that P = (A/B, C/D) is a point on E whose coordinates are rational numbers, written
in lowest terms with B and D positive. Prove that there is an integer v such that B = v^2
and D = v^3.
42.8. Write a program to search for all points on the elliptic curve

E : y^2 = x^3 + ax^2 + bx + c

such that x is an integer and |x| < H. Do this by trying all possible x values and checking
if x^3 + ax^2 + bx + c is a perfect square.
Test your program on the curve

y^2 = x^3 − 112x + 400.

How many integer points do you find with H = 100? H = 1000? H = 10000? H = 100000?

42.9. (a) Write a program to search for points on the elliptic curve

E : y^2 = x^3 + ax^2 + bx + c

such that x and y are rational numbers. Exercise 42.7 says that any such point must
look like (x, y) = (A/D^2, B/D^3), so the user should input an upper bound H and
your program should loop through all integers |A| ≤ H and 1 ≤ D ≤ √H and check
if
A^3 + aA^2 D^2 + bA D^4 + cD^6
is a perfect square. If it equals B^2, then you've found the point (A/D^2, B/D^3).
(b) Use your program to find all points on the elliptic curve

y^2 = x^3 − 2x^2 + 3x − 2

whose x-coordinate has the form x = A/D^2 with |A| ≤ 1500 and 1 ≤ D ≤ 38.


Chapter 43 


Points on Elliptic Curves 
Modulo p 


It can be quite difficult to solve a Diophantine equation. So rather than trying to 
solve using integers or rational numbers, we treat the Diophantine equation as a 
congruence and try to find solutions modulo p. This is a far easier task. To see 
why, consider the following example. 

How might we find all solutions “modulo 7” to the equation

x^2 + y^2 = 1?

In other words, what are the solutions to the congruence

x^2 + y^2 ≡ 1 (mod 7)?

This question is easy; we can just try each pair (x, y) with 0 ≤ x, y ≤ 6 and see
which ones make the congruence true. Thus, (1, 0) and (2, 2) are solutions, while
(1, 2) and (3, 2) are not solutions. The full set of solutions is

(0, 1), (0, 6), (1, 0), (2, 2), (2, 5), (5, 2), (5, 5), (6, 0).

We conclude that the equation x^2 + y^2 = 1 has 8 solutions modulo 7. Similarly,
there are 12 solutions modulo 11,

(0, 1), (0, 10), (1, 0), (3, 5), (3, 6), (5, 3), (5, 8),
(6, 3), (6, 8), (8, 5), (8, 6), (10, 0).
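The "try every pair" method just described takes only a few lines of code. The Python sketch below (the function name circle_solutions is an illustrative choice) lists all pairs (x, y) with 0 ≤ x, y < p that satisfy the congruence.

```python
def circle_solutions(p):
    """All pairs (x, y) with 0 <= x, y < p and x^2 + y^2 = 1 (mod p)."""
    return [(x, y)
            for x in range(p)
            for y in range(p)
            if (x * x + y * y - 1) % p == 0]

print(circle_solutions(7))
print(len(circle_solutions(7)), len(circle_solutions(11)))
```

The counts agree with the 8 solutions modulo 7 and the 12 solutions modulo 11 found above.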


Now we look at some elliptic curves and count how many points they have
modulo p for various primes p. We begin with the curve

E_2 : y^2 = x^3 + x

whose lone rational point is (0, 0). However, as Table 43.1 indicates, E_2 tends to
have lots of points modulo p. In the last column of Table 43.1 we have listed N_p,
the number of points modulo p.

p | Points modulo p on E_2 : y^2 = x^3 + x | N_p

Table 43.1: Points Modulo p on E_2

The number of points modulo p on an elliptic curve exhibits many wonderful 
and subtle patterns. Look closely at Table 43.1. Do you see any patterns? If 
not, maybe some more data would help. Table 43.2 gives the number of solutions 
modulo p without bothering to list the actual solutions. 

One partial pattern that immediately strikes the eye is that there are many
primes for which N_p is equal to p. This occurs for the primes

p = 2, 3, 7, 11, 19, 23, 31, 43, 47, 59, 67, and 71,

which is surely too often to be entirely random. Indeed, aside from the initial
entry 2, this list is precisely the set of primes (less than 71) that are congruent to 3
modulo 4. So we are led to make the following guess:


Guess. If p ≡ 3 (mod 4), then the elliptic curve E_2 : y^2 = x^3 + x has
exactly N_p = p points modulo p.

p || 31 | 37 | 41 | 43 | 47 | 53 | 59 | 61 | 67 | 71
N_p || 31 | 35 | 31 | 43 | 47 | 67 | 59 | 51 | 67 | 71

Table 43.2: The Number of Points N_p on E_2 Modulo p
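Counts like these are easy to recompute. The following Python sketch is one possible brute-force counter, not a method from the text: it first tallies how many y give each square modulo p, then adds up the tallies for the values x^3 + ax^2 + bx + c. The function name count_points is hypothetical.

```python
def count_points(a, b, c, p):
    """Number of pairs (x, y) mod p with y^2 = x^3 + a*x^2 + b*x + c (mod p)."""
    roots = {}                                # roots[s] = how many y satisfy y^2 = s (mod p)
    for y in range(p):
        s = y * y % p
        roots[s] = roots.get(s, 0) + 1
    return sum(roots.get((x**3 + a * x * x + b * x + c) % p, 0) for x in range(p))

# E_2 : y^2 = x^3 + x, so (a, b, c) = (0, 1, 0).
for p in (3, 5, 7, 11, 13, 17, 19, 23):
    print(p, count_points(0, 1, 0, p))
```

For the primes congruent to 3 modulo 4 in this range the count equals p, which is what the guess predicts.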


What about the other primes, those congruent to 1 modulo 4? The N_p's in this case
look fairly random. Sometimes N_p is less than p, such as for p = 5 and 17, while
sometimes N_p is greater than p, such as for p = 13 and 53. However, it also seems
to be true that as p gets larger N_p also becomes larger. In fact, N_p is usually found
hovering in the general neighborhood of p. A little thought suggests why this is
very reasonable.

In general, if we are trying to find the solutions modulo p to an elliptic curve

y^2 = x^3 + ax^2 + bx + c,

we substitute x = 0, 1, 2, ..., p − 1 and check, for each x, whether

x^3 + ax^2 + bx + c

turns out to be a square. It is reasonable to suppose that the values we get for
x^3 + ax^2 + bx + c are essentially randomly distributed, so we would expect the
values to be squares about half the time and to be nonsquares about half the time.
This follows from the fact, proved in Chapter 20, that half the numbers from 1 to
p − 1 are quadratic residues and the other half are nonresidues. We also observe
that if x^3 + ax^2 + bx + c happens to be a square, say it is congruent to t^2 (mod p),
then there are two possible values for y: y = t and y = −t. In summary,
approximately half of the values of x give two solutions modulo p, and about half give no
solutions modulo p, so we would expect to find approximately 2 × (p/2) = p
solutions. Of course, this argument doesn't prove that there are always exactly p
solutions; it merely gives a hint why the number of solutions should be more or less in
the neighborhood of p.

All this suggests that it might be interesting to investigate the difference
between p and N_p. We write this difference as

a_p = p − N_p

and call it the p-defect of E_2.¹ Table 43.3 lists the p-defects for the elliptic curve E_2.
This table exhibits a subtle pattern that is closely related to a topic that we studied
earlier. Take a few minutes to see if you can discover the pattern yourself before
reading on.

Table 43.3: The p-Defect a_p = p − N_p for E_2

During our number theoretic investigations, we found that the primes that are
congruent to 1 modulo 4 display many interesting properties. One of the most
striking was our discovery in Chapter 24 that these are the primes that can be
written as a sum of two squares p = A^2 + B^2. For example,

5 = 1^2 + 2^2, 13 = 3^2 + 2^2, 17 = 1^2 + 4^2, and 29 = 5^2 + 2^2.

Furthermore, Legendre's theorem in Chapter 36 tells us that if we require A to be
odd and A and B both positive, then there is only one choice for A and B. [In the
notation of Theorem 36.5, R(p) = 4(D_1 − D_3) = 8, where the 8 is accounted for
by switching A and B and/or changing their signs.] Compare these formulas with
the values a_5 = 2, a_13 = −6, a_17 = 2, and a_29 = 10. Now do you see a pattern? It
looks as if, when we write p = A^2 + B^2 with A positive and odd, a_p is either 2A
or −2A. Another way to say this is that it appears that the quantity p − (a_p/2)^2 is
always a perfect square. We check this for a few more values of p:

53 − (a_53/2)^2 = 2^2, 73 − (a_73/2)^2 = 8^2, 193 − (a_193/2)^2 = 12^2.

Amazingly, the pattern continues to hold.
¹The actual mathematical name for the quantity a_p is the trace of Frobenius, a terminology whose
full explication is unfortunately beyond the scope of this book. However, if you wish to impress
your mathematical friends or kill a conversation at a cocktail party, try casually venturing a remark
concerning “the trace of Frobenius acting on the ℓ-adic cohomology of an elliptic curve.”

p || 97 | 101 | 109 | 113 | 137 | 149 | 157 | 173 | 181 | 193
a_p/2 || 9 | 1 | −3 | −7 | −11 | −7 | −11 | 13 | 9 | −7

Table 43.4: The Value of a_p/2 for E_2


One question remains. When does a_p = 2A and when does a_p = −2A?
Looking at Table 43.3, we see that

a_p = 2A for p = 5, 17, 29, 37, 41, 61, 89, 97, 101, 173, 181,
a_p = −2A for p = 13, 53, 73, 109, 113, 137, 149, 157, 193.

These two lists don't seem to follow any orderly pattern. However, if we look at
the values of a_p/2 listed in Table 43.4, a pattern emerges.

Every a_p/2 value is congruent to 1 modulo 4. So if we write p = A^2 + B^2
with A positive and odd, then a_p = 2A if A ≡ 1 (mod 4) and a_p = −2A if
A ≡ 3 (mod 4). The following statement summarizes all our conclusions.


Theorem 43.1 (The Number of Points Modulo p on E_2 : y^2 = x^3 + x).
Let p be an odd prime, and let N_p denote the number of points modulo p on the
elliptic curve E_2 : y^2 = x^3 + x.
(a) If p ≡ 3 (mod 4), then N_p = p.
(b) If p ≡ 1 (mod 4), write p = A^2 + B^2 with A positive and odd. (We know
from Chapter 24 that this is always possible.) Then N_p = p ± 2A, where the
sign is chosen negative if A ≡ 1 (mod 4) and positive if A ≡ 3 (mod 4).
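The theorem can be compared against a direct count for small primes. In the Python sketch below, count_points_E2 counts solutions by brute force and predicted_N applies the rule of Theorem 43.1, finding A by trial over odd values. Both function names are illustrative, and the code assumes its input is an odd prime.

```python
from math import isqrt

def count_points_E2(p):
    """Count pairs (x, y) mod p with y^2 = x^3 + x (mod p) by brute force."""
    roots = {}                                  # roots[s] = number of y with y^2 = s (mod p)
    for y in range(p):
        s = y * y % p
        roots[s] = roots.get(s, 0) + 1
    return sum(roots.get((x**3 + x) % p, 0) for x in range(p))

def predicted_N(p):
    """Value of N_p given by Theorem 43.1; p is assumed to be an odd prime."""
    if p % 4 == 3:
        return p
    for A in range(1, isqrt(p) + 1, 2):         # A runs over positive odd integers
        B = isqrt(p - A * A)
        if A * A + B * B == p:                  # found p = A^2 + B^2
            return p - 2 * A if A % 4 == 1 else p + 2 * A
    raise ValueError("expected a prime congruent to 1 modulo 4")

for p in (5, 13, 17, 29, 53):
    print(p, count_points_E2(p), predicted_N(p))
print(predicted_N(130657))
```

The two columns agree for each prime tried, and the last line evaluates the prediction for p = 130657 without having to count points.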


The first part is comparatively easy to verify, but we omit the proof because we
will be proving a similar result later. The second part is considerably more difficult,
so we are content to illustrate it with one more example. The prime p = 130657 is
congruent to 1 modulo 4. Using trial and error, a computer, or the method described
in Chapter 24, we write 130657 = 111^2 + 344^2 as a sum of two squares. Now
111 ≡ 3 (mod 4), so we conclude that E_2 has 130657 + 2 · 111 = 130879 points
modulo 130657.

Next we look at our old friend, the elliptic curve

E_1 : y^2 = x^3 + 17.

Table 43.5: The Number of Points Modulo p and Defect a_p for E_1


Just as we did for E_2, we make a table giving the number of points N_p on E_1
modulo p and the defect a_p = p − N_p. The values are listed in Table 43.5.
Again there are many primes for which the defect a_p is zero:

p = 2, 3, 5, 11, 17, 23, 29, 41, 47, 53, 59, 71, 83, 89, 101, 107, 113.

These primes don't follow any pattern modulo 4, but they do follow a pattern
modulo 3. Aside from 3 itself, they are all congruent to 2 modulo 3. So we might guess
that if p ≡ 2 (mod 3), then N_p = p. We can use primitive roots to verify that this
guess is correct.


Theorem 43.2. If p ≡ 2 (mod 3), then the number of points N_p on the elliptic
curve

E_1 : y^2 = x^3 + 17 modulo p

satisfies N_p = p.

Proof. Before trying to give a proof, let's look at an example. We take the prime
p = 11. To find the points modulo 11 on E_1, we substitute x = 0, 1, ..., 10 into
x^3 + 17 and check if the value is a square modulo 11. Here's what happens when
we substitute:

x || 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
x^3 (mod 11) || 0 | 1 | 8 | 5 | 9 | 4 | 7 | 2 | 6 | 3 | 10
x^3 + 17 (mod 11) || 6 | 7 | 3 | 0 | 4 | 10 | 2 | 8 | 1 | 9 | 5

Notice that the numbers x^3 (mod 11) are just the numbers 0, 1, ..., 10 rearranged,
and the same for the numbers x^3 + 17 (mod 11). So when we look for solutions
to

y^2 ≡ 0^3 + 17 (mod 11), y^2 ≡ 1^3 + 17 (mod 11), y^2 ≡ 2^3 + 17 (mod 11),
y^2 ≡ 3^3 + 17 (mod 11), ..., y^2 ≡ 10^3 + 17 (mod 11),

we're really just looking for solutions to

y^2 ≡ 0 (mod 11), y^2 ≡ 1 (mod 11), y^2 ≡ 2 (mod 11),
y^2 ≡ 3 (mod 11), ..., y^2 ≡ 10 (mod 11).

The first congruence, y^2 ≡ 0 (mod 11), has one solution, y ≡ 0 (mod 11). As
for the other 10 congruences, we know from Chapter 20 that half of the numbers
from 1 to 10 are quadratic residues modulo 11, and the other half are nonresidues.
So half of the congruences y^2 ≡ a (mod 11) have two solutions (remember that
if b is a solution then so is p − b), and half of them have no solutions. So overall
there are 1 + 2 · 5 = 11 solutions.

If you try a few more examples, you'll find that the same phenomenon occurs.
Of course, you must stick with primes p ≡ 2 (mod 3); the situation is entirely
different for primes p ≡ 1 (mod 3), as you can check for yourself by computing
x^3 + 17 (mod 7) for x = 0, 1, 2, ..., 6.

So we try to show that if p ≡ 2 (mod 3) then the numbers

0^3 + 17, 1^3 + 17, 2^3 + 17, ..., (p − 1)^3 + 17 (mod p)

are the same as the numbers

0, 1, 2, ..., p − 1 (mod p)

in some order. Notice that each list contains exactly p numbers. So all that we need
to do is show that the numbers in the first list are distinct, since that will imply that
they hit all the numbers in the second list.

Suppose we take two numbers from the first list, say b_1^3 + 17 and b_2^3 + 17, and
suppose that they are equal modulo p. In other words,

b_1^3 + 17 ≡ b_2^3 + 17 (mod p), so b_1^3 ≡ b_2^3 (mod p).

We want to prove that b_1 = b_2. If b_1 ≡ 0 (mod p), then b_2 ≡ 0 (mod p), and
vice-versa, so we may as well assume that b_1 ≢ 0 (mod p) and b_2 ≢ 0 (mod p).
We would like to take the cube root of both sides of the congruence

b_1^3 ≡ b_2^3 (mod p),

but how? The answer is to apply Fermat's Little Theorem b^(p−1) ≡ 1 (mod p). We
also make use of the assumption that p ≡ 2 (mod 3), which tells us that 3 does
not divide p − 1. Thus 3 and p − 1 are relatively prime, so the Linear Equation
Theorem (Chapter 6) says that we can find a solution to the equation

3u − (p − 1)v = 1.

In fact, it's easy to write down a solution, u = (2p − 1)/3 and v = 2. Of course,
(2p − 1)/3 is an integer because p ≡ 2 (mod 3).

Notice that 3u ≡ 1 (mod p − 1), so in some sense raising to the u-th power is
the same as raising to the 1/3 power (i.e., taking a cube root; you may recognize
that we developed this idea in a more general setting in Chapter 17). So we raise
both sides of the congruence b_1^3 ≡ b_2^3 (mod p) to the u-th power and use Fermat's
Little Theorem to compute

(b_1^3)^u ≡ (b_2^3)^u (mod p)
b_1^(3u) ≡ b_2^(3u) (mod p)
b_1^(1+(p−1)v) ≡ b_2^(1+(p−1)v) (mod p)
b_1 · (b_1^(p−1))^v ≡ b_2 · (b_2^(p−1))^v (mod p)
b_1 ≡ b_2 (mod p).

This proves that the numbers 0^3 + 17, 1^3 + 17, ..., (p − 1)^3 + 17 are all different
modulo p, so they must equal 0, 1, ..., p − 1 in some order.
To recapitulate, we have shown that if we substitute

x = 0, 1, 2, ..., p − 1

into x^3 + 17 (mod p), we get back precisely the numbers

0, 1, 2, ..., p − 1 (mod p).

The congruence y^2 ≡ 0 (mod p) has one solution: y ≡ 0 (mod p). On the other
hand, half of the congruences

y^2 ≡ 1 (mod p), y^2 ≡ 2 (mod p), y^2 ≡ 3 (mod p), ...,
y^2 ≡ p − 2 (mod p), y^2 ≡ p − 1 (mod p)

have two solutions, and the other half have no solutions, since half of the numbers
are quadratic residues and the other half are nonresidues (see Chapter 20). Hence,
the Diophantine equation y^2 = x^3 + 17 has exactly

1 + 2 × (p − 1)/2 = p

solutions modulo p. □

Table 43.6: The Number of Points Modulo p and Defect a_p for E_3


We now understand what happens on E_1 modulo p for primes p ≡ 2 (mod 3).
Exercise 43.3 asks you to discover a far more subtle pattern lurking in the a_p's
when p ≡ 1 (mod 3).
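The key step in the proof of Theorem 43.2, taking cube roots modulo p by raising to the power u = (2p − 1)/3, can be checked directly. The Python sketch below does this and also counts the points on E_1 modulo p by brute force. The function names cube_root_mod and count_points_E1 are illustrative, and the code assumes p is a prime congruent to 2 modulo 3.

```python
def cube_root_mod(c, p):
    """Cube root of c modulo a prime p = 2 (mod 3), computed as c^u with u = (2p - 1)/3."""
    if p % 3 != 2:
        raise ValueError("this shortcut needs p = 2 (mod 3)")
    u = (2 * p - 1) // 3          # 3u = 1 + 2(p - 1), so b^(3u) = b by Fermat's Little Theorem
    return pow(c, u, p)

def count_points_E1(p):
    """Count pairs (x, y) mod p with y^2 = x^3 + 17 (mod p) by brute force."""
    roots = {}                    # roots[s] = number of y with y^2 = s (mod p)
    for y in range(p):
        s = y * y % p
        roots[s] = roots.get(s, 0) + 1
    return sum(roots.get((x**3 + 17) % p, 0) for x in range(p))

p = 11
print(sorted(pow(b, 3, p) for b in range(p)))                       # the cubes modulo 11
print(all(cube_root_mod(pow(b, 3, p), p) == b for b in range(p)))   # u-th power undoes cubing
print(count_points_E1(p))
```

Trying a prime such as p = 7, which is 1 modulo 3, shows the contrast: the cubes modulo 7 repeat, and cube_root_mod refuses the input.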

Let's pause to review the patterns we've discovered. For the elliptic curves E_1
and E_2, we've found that the p-defect a_p equals 0 for about half of all primes, and
we've been able to describe quite precisely these 0-defect primes. For the other
primes we've seen that the a_p's satisfy a more subtle pattern involving squares;
that is, p − (a_p/2)^2 is a perfect square for E_2, and something similar for E_1 (see
Exercise 43.3). Of course, E_1 and E_2 are only two elliptic curves among the
myriad, so having discovered common patterns for E_1 and E_2, we should certainly
investigate at least one or two more examples. Table 43.6 gives the number of
points modulo p and the p-defects for the elliptic curve


E_3 : y^2 = x^3 − 4x^2 + 16.


Alas and alack, it seems that there are very few primes for which the p-defect a_p
is 0. Even if we extend our Table 43.6, we find that the only primes p < 5000 with
a_p = 0 are

p = 2, 19, 29, 199, 569, 809, 1289, 1439, 2539, 3319, 3559, 3919.

Apart from 2, these primes do all happen to be congruent to 9 modulo 10, but unfortunately
there are lots of 9 mod 10 primes, such as 59, 79, 89, and 109, that are not in the list.
There doesn't appear to be a simple pattern governing which primes are in the list,
and indeed no one has been able to find a pattern. It wasn't until 1987 that Noam
Elkies was able to show that there are always infinitely many primes p for which
a_p = 0.

Lacking primes with a_p = 0, we might try looking for patterns involving
squares; but again we search in vain, and no pattern emerges. In fact, what we
find if we look at other elliptic curves is that most of them are like E_3, with very
few a_p's being 0 and no patterns involving squares. The elliptic curves E_1 and E_2
are of a very special type; they are elliptic curves with complex multiplication.²
We do not give the precise definition, but only say that elliptic curves with
complex multiplication have half of their a_p's equal to 0, while elliptic curves without
complex multiplication have very few of their a_p's equal to 0.


Exercises 


43.1. (a) For each prime number p, let M_p be the number of solutions modulo p to the
equation x^2 + y^2 = 1. Figure out the values of M_3, M_5, M_13, and M_17. [Hint.
Here's an efficient way to do this computation. First, make a list of all of the squares
modulo p. Second, substitute in each 0 ≤ y < p and check if 1 − y^2 is a square
modulo p.]

(b) Use your data from (a) and the values M_7 = 8 and M_11 = 12 that we computed
earlier to make a conjecture about the value of M_p. Test your conjecture by computing
M_19. According to your conjecture, what is the value of M_1373? of M_1987?

(c) Prove that your conjecture in (b) is correct. [Hint. Formulas in Chapter 3 might be
helpful.]


43.2. (a) Find all solutions to the Diophantine equation y^2 = x^5 + 1 modulo 7. How
many solutions are there?
(b) Find all solutions to the Diophantine equation y^2 = x^5 + 1 modulo 11. How many
solutions are there?
(c) Let p be a prime with the property that p ≢ 1 (mod 5). Prove that the Diophantine
equation y^2 = x^5 + 1 has exactly p solutions modulo p.


43.3. For each prime p ≡ 1 (mod 3) in the table for E_1, compute the quantity 4p − a_p^2.
Do the numbers you compute have some sort of special form?


43.4. Write a program to count the number of solutions of the congruence

E : y^2 ≡ x^3 + ax^2 + bx + c (mod p)

using one of the following methods:

(i) First make a list of the squares modulo p, then substitute x = 0, 1, ..., p − 1 into
x^3 + ax^2 + bx + c and look at the remainder modulo p. If it is a nonzero square,
add 2 to your list, if it is zero, add 1 to your list, and if it is not a square, ignore it.

(ii) For each x = 0, 1, ..., p − 1, compute the Legendre symbol ((x^3 + ax^2 + bx + c)/p). If it
is +1, add 2 to your list; if it is −1, ignore it. [And if x^3 + ax^2 + bx + c ≡ 0 (mod p),
then just add 1 to your list.]

Use your program to compute the number of points N_p and the p-defect a_p = p − N_p for
each of the following curves and for all primes 2 < p < 100. Which curve(s) do you think
have complex multiplication?

(a) y^2 = x^3 + x^2 − 3x + 11    (c) y^2 = x^3 + 4x^2 + 2x
(b) y^2 = x^3 − 595x + 5586      (d) y^2 = x^3 + 2x − 7

²An elliptic curve has complex multiplication if its equation satisfies a certain special sort of
transformation property. For example, if (x, y) is a solution to the equation E_2 : y^2 = x^3 + x,
then the pair (−x, iy) will also be a solution. The presence of numbers such as i = √−1 in these
formulas led to the name “complex multiplication.”

43.5. In this exercise you will discover the pattern of the p-defects for the elliptic curve
E : y^2 = x^3 + 1. To assist you, I offer the following list.

p || 73 | 79 | 83 | 89 | 97 | 101 | 103 | 107 | 109 | 113
a_p || −10 | −4 | 0 | 0 | 14 | 0 | 20 | 0 | 2 | 0

The Defect a_p for the Elliptic Curve E : y^2 = x^3 + 1


(a) Make a conjecture as to which primes have defect a_p = 0, and prove that your
conjecture is correct.

(b) For those primes with a_p ≠ 0, compute the value of 4p − a_p^2 and discover what is
special about these numbers.

(c) For every prime p < 113 with p ≡ 1 (mod 3), find all pairs of integers (A, B) that
satisfy 4p = A^2 + 3B^2. (Note that there may be several solutions. For example,
4 · 7 = 28 equals 5^2 + 3 · 1^2 and 4^2 + 3 · 2^2. An efficient way to find the solutions is to
compute 4p − 3B^2 for all B < √(4p/3) and pick out those values for which 4p − 3B^2
is a perfect square.)

(d) Compare the values of A and B with the values of a_p given in the table. Make as
precise a conjecture as you can as to how they are related.

(e) For each of the following primes p, I have given you the pairs (A, B) satisfying
4p = A^2 + 3B^2. Use your conjecture in (d) to guess the value of a_p.

(i) p = 541 (A, B) = (46, 4), (29, 21), (17, 25)
(ii) p = 2029 (A, B) = (79, 25), (77, 27), (2, 52)
(iii) p = 8623 (A, B) = (173, 39), (145, 67), (28, 106)


Chapter 44 


Torsion Collections Modulo p 
and Bad Primes 


In the last chapter we found simple patterns for the p-defects of E_1 and E_2, but
there did not seem to be any similar pattern for E_3. However, the N_p's for E_3 do
exhibit a pattern that you may have already noticed. If not, take a moment now to
look back at Table 43.6 and try to discover the pattern for yourself before reading
on.

It appears that the N_p's for E_3 have the property that

N_p ≡ 4 (mod 5) for all primes except p = 2 and p = 11.

Although we won't give a full verification of this property, we can at least give
some idea why it is true. Recall from Chapter 42 that E_3 has a torsion collection
consisting of the four points

P_1 = (0, 4), P_2 = (0, −4), P_3 = (4, 4), P_4 = (4, −4).

This means that the lines connecting any two of these points do not intersect E_3
in any additional points. The method of taking pairs of points on an elliptic curve,
connecting them with a line, and intersecting with the curve can all be done using
equations without any reference to geometry. This means that we can use the same
method to find points modulo p!

Let's look at an example. The point Q = (1, 8) is a solution to

y^2 ≡ x^3 − 4x^2 + 16 (mod 17).

The line through Q and P_1 = (0, 4) is y = 4x + 4. Substituting the equation of the
line into the equation of the elliptic curve gives

(4x + 4)^2 ≡ x^3 − 4x^2 + 16 (mod 17)
x^3 − 3x^2 + 2x ≡ 0 (mod 17)
x(x − 1)(x − 2) ≡ 0 (mod 17).

So we get the two known points Q and P_1 with x-coordinates x = 0 and x = 1,
and we also get a new point with x = 2. Substituting x = 2 into the equation of
the line gives y = 12, so we have found a new solution (2, 12) for E_3 modulo 17.

If we try the same idea with the points Q = (1, 8) and P_3 = (4, 4), we get the
line y = −(4/3)x + 28/3. How can we make sense of this line modulo 17? Well,
the fraction −4/3 is just the solution to the equation 3u = −4. So the number
“−4/3 modulo 17” is the solution to the congruence 3u ≡ −4 (mod 17). We
know how to solve such congruences; in this case, the answer is u = 10. Similarly,
“28/3 modulo 17” is 15, so the line through Q = (1, 8) and P_3 = (4, 4) modulo 17
is y = 10x + 15. Now we substitute into the equation of E_3 and solve as before to
find a new solution (14, 2) on E_3 modulo 17.

We can also do the same thing with Q and P_2, giving the solution (11, 9),
and with Q and P_4, giving the solution (15, 3). Thus, starting with the single
solution Q, we used the four points in the torsion collection to find four more solutions.
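The line-and-intersect computation modulo p can be automated. The Python sketch below is one way to do it under stated assumptions: the two points have different x-coordinates modulo p, and division is carried out with a modular inverse via pow(d, -1, p), which needs Python 3.8 or later. The function name third_point_mod is hypothetical. As over the rationals, the three x-coordinates on a line of slope m sum to m^2 − a.

```python
def third_point_mod(P, Q, a, p):
    """Third intersection modulo p of the line through P and Q with
    y^2 = x^3 + a*x^2 + b*x + c; assumes x-coordinates differ modulo p."""
    (x1, y1), (x2, y2) = P, Q
    inverse = pow((x2 - x1) % p, -1, p)       # 1/(x2 - x1) modulo p
    m = (y2 - y1) * inverse % p               # slope of the line modulo p
    x3 = (m * m - a - x1 - x2) % p            # the three x-coordinates sum to m^2 - a
    y3 = (y1 + m * (x3 - x1)) % p             # plug x3 into the equation of the line
    return (x3, y3)

Q = (1, 8)
torsion = [(0, 4), (0, -4), (4, 4), (4, -4)]  # P_1, P_2, P_3, P_4 on E_3
print([third_point_mod(Q, P, -4, 17) for P in torsion])
```

Run on Q = (1, 8) modulo 17, this reproduces the four new solutions (2, 12), (11, 9), (14, 2), and (15, 3) found above.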

Now consider the curve $E_3$ modulo $p$ for any prime $p$. It already has the four 
points $P_1, P_2, P_3, P_4$. Each time we find another point $Q$ on $E_3$ modulo $p$, we can 
take the line $L_i$ connecting $Q$ to each of the $P_i$’s. Each line $L_i$ intersects $E_3$ in a 
new point $Q_i$. In this way we get four additional points $Q_1, Q_2, Q_3, Q_4$ to go with 
the original point $Q$. Thus, points on $E_3$ modulo $p$ come in bundles of five, except 
that there are only four $P_i$’s. Hence 


$$\{\text{Solutions to } E_3 \text{ modulo } p\} = \{\text{The 4 points } P_1, P_2, P_3, P_4\} \cup \{\text{Bundles containing 5 solutions each}\}$$


Therefore, the total number of solutions to $E_3$ modulo $p$ is equal to 4 plus a multiple 
of 5; that is, $N_p \equiv 4 \pmod 5$. This is true for all primes except $p = 2$ and $p = 11$. 
(For $p = 2$ and $p = 11$, some of the bundles of 5 points contain repetitions.) 
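
The congruence is easy to test by brute force. The following sketch counts the pairs $(x, y)$ satisfying the equation of $E_3$ modulo $p$ and prints $N_p$ modulo 5; the function names `count_points` and `primes_up_to` are made up for this illustration.

```python
def count_points(a, b, c, p):
    """Number N_p of pairs (x, y) with y^2 = x^3 + a*x^2 + b*x + c (mod p)."""
    # square_roots[r] = how many y in 0..p-1 satisfy y^2 = r (mod p)
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    return sum(square_roots[(x**3 + a * x**2 + b * x + c) % p] for x in range(p))


def primes_up_to(n):
    """Primes up to n by trial division (fine for small n)."""
    return [q for q in range(2, n + 1)
            if all(q % d for d in range(2, int(q**0.5) + 1))]


# E_3 : y^2 = x^3 - 4x^2 + 16
for p in primes_up_to(50):
    N_p = count_points(-4, 0, 16, p)
    print(p, N_p, N_p % 5)
```

Every line of the output should end in 4 except the lines for $p = 2$ and $p = 11$.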

The congruence $N_p \equiv 4 \pmod 5$ also explains our earlier observation about 
the primes with $a_p = 0$. To see why, suppose that $a_p = 0$. Then 

$$p = N_p \equiv 4 \pmod 5.$$


Furthermore, $p$ is odd, so $p \equiv 9 \pmod{10}$. This shows that if $a_p = 0$ then $p$ is 
9 modulo 10; but it does not say that every 9 modulo 10 prime has $a_p = 0$. This is 
an important distinction that stands in sharp contrast to our results for $E_1$ and $E_2$. 
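
A small experiment makes the one-way nature of this implication visible. The sketch below (with illustrative names `defect` and `primes_up_to`) lists the primes below 100 with $a_p = 0$ on $E_3$ next to the primes that are 9 modulo 10.

```python
def defect(a, b, c, p):
    """p-defect a_p = p - N_p for y^2 = x^3 + a*x^2 + b*x + c modulo p."""
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    n_p = sum(square_roots[(x**3 + a * x**2 + b * x + c) % p] for x in range(p))
    return p - n_p


def primes_up_to(n):
    return [q for q in range(2, n + 1)
            if all(q % d for d in range(2, int(q**0.5) + 1))]


primes = primes_up_to(100)
zero_defect = [p for p in primes if defect(-4, 0, 16, p) == 0]
nines = [p for p in primes if p % 10 == 9]
print("a_p = 0:        ", zero_defect)
print("p = 9 (mod 10): ", nines)
```

Apart from the bad prime 2, every prime in the first list should appear in the second, but not the other way around.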

The preceding argument is fine, but what about the primes $p = 2$ and $p = 11$ 
that do not follow the pattern? It turns out that 2 and 11 are somewhat special for 
the elliptic curve $E_3 : y^2 = x^3 - 4x^2 + 16$. They are special because 
they are the only primes for which the polynomial $x^3 - 4x^2 + 16$ has a double or 
triple root modulo $p$. Thus, 

$$\begin{aligned}
x^3 - 4x^2 + 16 &\equiv x^3 \pmod{2} && \text{has a triple root } x = 0, \text{ and} \\
x^3 - 4x^2 + 16 &\equiv (x + 1)^2 (x + 5) \pmod{11} && \text{has a double root } x = -1.
\end{aligned}$$


In general, we say that $p$ is a bad prime for an elliptic curve 

$$E : y^2 = x^3 + ax^2 + bx + c$$

if the polynomial $x^3 + ax^2 + bx + c$ has a double or triple root modulo $p$. It is 
not hard to find the bad primes for $E$, since one can show that they are exactly the 
primes that divide the discriminant of $E$,¹ 

$$\Delta(E) = -4a^3c + a^2b^2 - 4b^3 - 27c^2 + 18abc.$$

For example, 

$$\begin{aligned}
\Delta(E_1) &= -7803 = -3^3 \cdot 17^2, \\
\Delta(E_2) &= -4 = -2^2, \\
\Delta(E_3) &= -2816 = -2^8 \cdot 11.
\end{aligned}$$
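
The discriminant formula can be checked directly. In this sketch the function names are illustrative, and $E_1$ is assumed to be the curve $y^2 = x^3 + 17$, which is consistent with the value $-3^3 \cdot 17^2$ above; $E_2 : y^2 = x^3 + x$ and $E_3 : y^2 = x^3 - 4x^2 + 16$ are as in the text.

```python
def discriminant(a, b, c):
    """Discriminant of y^2 = x^3 + a*x^2 + b*x + c."""
    return -4 * a**3 * c + a**2 * b**2 - 4 * b**3 - 27 * c**2 + 18 * a * b * c


def prime_divisors(n):
    """Distinct primes dividing the nonzero integer n, by trial division."""
    n = abs(n)
    found = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            found.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        found.append(n)
    return found


curves = {"E1": (0, 0, 17), "E2": (0, 1, 0), "E3": (-4, 0, 16)}  # E1 is an assumption
for name, (a, b, c) in curves.items():
    delta = discriminant(a, b, c)
    print(name, delta, prime_divisors(delta))
# E1 -7803 [3, 17]
# E2 -4 [2]
# E3 -2816 [2, 11]
```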


Exercises 


44.1. Suppose that the elliptic curve $E$ has a torsion collection consisting of the $t$ points 
$P_1, P_2, \ldots, P_t$. Explain why the number of solutions to $E$ modulo $p$ should satisfy 

$$N_p \equiv t \pmod{t + 1}.$$


44.2. Exercise 42.2(c) says that the elliptic curve $E : y^2 = x^3 - x$ has a torsion collection 
$\{(0, 0), (1, 0), (-1, 0)\}$ containing three points. 
(a) Find the number of points on $E$ modulo $p$ for $p = 2, 3, 5, 7, 11$. Which ones satisfy 
$N_p \equiv 3 \pmod 4$? 
(b) Find the solutions to $E$ modulo 11, other than the solutions in the torsion collection, 
and group them into bundles of four solutions each by drawing lines through the 
points in the torsion collection. 


¹We’ve cheated a little bit in our description of the bad primes, since for various technical reasons 
the prime 2 is always bad for our elliptic curves. However, it is sometimes possible to turn a bad 
prime into a good prime by using an equation for $E$ that includes an $xy$ term or a $y$ term. 

44.3. This exercise investigates the values of $a_p$ for the bad primes. 
(a) Find the bad primes for each of the following elliptic curves. 
i) E:y=234+27-274+2 
(ii) ay? S22 3a 
(iii) $E : y^2 = x^3 + 2x^2 + x + 3$ 
(b) For each curve in (a), compute the $p$-defects $a_p$ for its bad primes. 


(c) Here are a few more sample elliptic curves, together with a list of the p-defects for 
their bad primes. 


$E$ | $\Delta(E)$ | $a_p$ for bad primes 
$y^2 = x^3 + 21x^2 + 37x + 42$ | $-31 \cdot 83 \cdot 239$ | 


Several patterns, of varying degrees of subtlety, are exhibited by the p-defects of bad 
primes. Describe as many as you can. 


44.4. For this exercise, $p$ is a prime greater than 3. 

(a) Check that the elliptic curve $y^2 = x^3 + p$ has $p$ as a bad prime. Figure out the value 
of $a_p$. Prove that your guess is correct. 

(b) Check that the elliptic curve $y^2 = x^3 + x^2 + p$ has $p$ as a bad prime. Figure out the 
value of $a_p$. Prove that your guess is correct. 

(c) Check that the elliptic curve $y^2 = x^3 - x^2 + p$ has $p$ as a bad prime. Figure out 
the value of $a_p$. Prove that your guess is correct. [Hint. For (c), the value of $a_p$ will 
depend on $p$.] 


Chapter 45 


Defect Bounds and Modularity Patterns 


Chapters 43 and 44, despite their length, have barely begun to scratch the surface 
of the wonderful patterns lurking in elliptic curves modulo p. In this chapter we 
continue the investigation. 

We have already indicated why the number of points $N_p$ on an elliptic curve 
modulo $p$ should be approximately equal to $p$, and we have found many patterns 
for the $p$-defect $a_p = p - N_p$. How might we quantify the statement that “$N_p$ is 
approximately $p$”? We could say that “$a_p$ tends to be small,” but this just raises the 
question of how small. Looking at the tables for $E_1$, $E_2$, and $E_3$ in Chapter 43, 
it seems that $a_p$ can get fairly large when $p$ is large. One thing we might do is 
study the relative size of $p$ and $a_p$. Table 45.1 lists those primes $p$ for which the 
$p$-defect on $E_3$ seems to be particularly large, either positively or negatively. For 
comparison purposes, we have also listed the values of $\sqrt{p}$, $\sqrt[3]{p}$, and $\log(p)$. 

It is clear from Table 45.1 that although the $a_p$’s are indeed much smaller 
than $p$, they can grow to be much larger than $\sqrt[3]{p}$ and $\log(p)$. The $a_p$’s are also 
larger than $\sqrt{p}$, but as you will observe, they are never twice as large. In other 
words, it appears that $|a_p|$ is never more than $2\sqrt{p}$. 


Theorem 45.1 (Hasse’s Theorem). (H. Hasse, 1933) Let $N_p$ be the number of 
points modulo $p$ on an elliptic curve, and let $a_p = p - N_p$ be the $p$-defect. Then 

$$|a_p| < 2\sqrt{p}.$$


In other words, the number of points $N_p$ on an elliptic curve modulo $p$ is 
approximately equal to $p$, with an error of no more than $2\sqrt{p}$. This beautiful result 
was conjectured by Emil Artin in the 1920s and proved by Helmut Hasse during 
the 1930s. A generalized version was proved by André Weil in the 1940s, and this 
was again vastly generalized by Pierre Deligne in the 1970s. 

 $a_p$ |  $p$ | $\sqrt{p}$ | $\sqrt[3]{p}$ | $\log(p)$ 
  -30 |  239 | 15.45962 |  6.20582 | 
   40 |  439 | 20.95233 |  7.60014 | 2.64246 
   44 |  593 | 24.35159 |  8.40140 | 2.77305 
   50 |  739 | 27.18455 |  9.04097 | 2.86864 
   53 |  797 | 28.23119 |  9.27156 | 2.90146 
  -52 |  827 | 28.75761 |  9.38646 | 2.91751 
   68 | 1327 | 36.42801 | 10.98897 | 3.12287 
  -72 | 1367 | 36.97296 | 11.09829 | 3.13577 
  -68 | 1381 | 37.16181 | 11.13605 | 3.14019 
  -70 | 1429 | 37.80212 | 11.26360 | 3.15503 
  -71 | 1453 | 38.11824 | 11.32631 | 3.16227 
   78 | 1627 | 40.33609 | 11.76149 | 3.21139 
   84 | 2053 | 45.31004 | 12.70953 | 3.31239 
   89 | 2083 | 45.63989 | 12.77114 | 3.31869 
  -86 | 2113 | 45.96738 | 12.83216 | 3.32490 
  -91 | 2143 | 46.29255 | 12.89261 | 3.33102 
   93 | 2267 | 47.61302 | 13.13663 | 3.35545 
  -98 | 2551 | 50.50743 | 13.66376 | 3.40671 
 -103 | 3221 | 56.75385 | 14.76829 | 3.50799 
  114 | 3733 | 61.09828 | 15.51265 | 3.57206 
 -123 | 4051 | 63.64747 | 15.94119 | 3.60756 
  129 | 4733 | 68.79680 | 16.78980 | 3.67514 
 -132 | 4817 | 69.40461 | 16.88854 | 3.68278 
  132 | 5081 | 71.28113 | 17.19160 | 3.70595 
  138 | 5407 | 73.53231 | 17.55168 | 
 -146 | 5693 | 75.45197 | 17.85584 | 
 -138 | 5711 | 75.57116 | 17.87464 | 3.75671 
 -147 | 6317 | 79.47956 | 18.48575 | 3.80051 
 -146 | 6373 | 79.83107 | 18.54021 | 3.80434 
  164 | 7043 | 83.92258 | 19.16840 | 3.84776 
  153 | 7187 | 84.77618 | 19.29816 | 
  162 | 7211 | 84.91761 | 19.31962 | 

Table 45.1: Large Values of $a_p$ for the Curve $E_3 : y^2 = x^3 - 4x^2 + 16$ 
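
Hasse's bound can also be watched in action on $E_3$. The sketch below recomputes the $p$-defects by brute force for the primes below 300 and confirms that each satisfies $|a_p| < 2\sqrt{p}$; the helper names are illustrative, and the comparison is done with integers to avoid rounding.

```python
def count_points(a, b, c, p):
    """Number N_p of pairs (x, y) with y^2 = x^3 + a*x^2 + b*x + c (mod p)."""
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    return sum(square_roots[(x**3 + a * x**2 + b * x + c) % p] for x in range(p))


def primes_up_to(n):
    return [q for q in range(2, n + 1)
            if all(q % d for d in range(2, int(q**0.5) + 1))]


def hasse_holds(a_p, p):
    # |a_p| < 2*sqrt(p) is equivalent to a_p^2 < 4p, which needs no floating point.
    return a_p * a_p < 4 * p


# E_3 : y^2 = x^3 - 4x^2 + 16
for p in primes_up_to(300):
    a_p = p - count_points(-4, 0, 16, p)
    status = "ok" if hasse_holds(a_p, p) else "VIOLATION"
    print(p, a_p, status)
```

The same loop, run further, is one way to regenerate the large defects collected in Table 45.1.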

The proof of Hasse’s Theorem for general elliptic curves is beyond our present 
means, but we can at least indicate why it is true for the elliptic curve $E_2$ with 
equation $y^2 = x^3 + x$. Recall from Chapter 43 that the defect for this curve is 
given by the rules 

$$\begin{aligned}
a_p &= 0 && \text{if } p \equiv 3 \pmod 4, \text{ and} \\
a_p &= \pm 2A && \text{if } p \equiv 1 \pmod 4, \text{ where we write } p = A^2 + B^2.
\end{aligned}$$

If $a_p = 0$, there is little more to be said. On the other hand, if $p \equiv 1 \pmod 4$, then 
we can estimate 

$$|a_p| = 2A = 2\sqrt{p - B^2} < 2\sqrt{p},$$

which is exactly Hasse’s Theorem. 
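
The rule for $E_2$ can be compared with a brute-force count. This passage does not restate which of the two squares is called $A^2$; the sketch assumes $A$ is the odd one, a choice that matches hand counts for $p = 5$ (where $a_5 = 2$) and $p = 13$ (where $a_{13} = -6$). Function names are illustrative, and `math.isqrt` needs Python 3.8 or later.

```python
import math


def defect(a, b, c, p):
    """p-defect a_p = p - N_p for y^2 = x^3 + a*x^2 + b*x + c modulo p."""
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    n_p = sum(square_roots[(x**3 + a * x**2 + b * x + c) % p] for x in range(p))
    return p - n_p


def odd_square_part(p):
    """The odd A > 0 with p = A^2 + B^2 (assumes p is a prime with p = 1 mod 4)."""
    for A in range(1, math.isqrt(p) + 1, 2):
        B = math.isqrt(p - A * A)
        if A * A + B * B == p:
            return A
    return None


def primes_up_to(n):
    return [q for q in range(2, n + 1)
            if all(q % d for d in range(2, int(q**0.5) + 1))]


# E_2 : y^2 = x^3 + x
for p in primes_up_to(100):
    if p % 4 == 1:
        A = odd_square_part(p)
        print(p, defect(0, 1, 0, p), "2A =", 2 * A, " 2*sqrt(p) =", round(2 * math.sqrt(p), 3))
```

In each printed line the defect should be $\pm 2A$, and $2A$ stays below $2\sqrt{p}$ because $B \ne 0$.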

The final $a_p$ pattern that we discuss is so unexpected and unusual that you may 
be amazed that anyone noticed it at all. Indeed, it took many years and indications 
from many sources before mathematicians finally began to realize that this remarkable 
modularity pattern might be universally true. Although we are not able to give 
a full explanation of exactly what a modularity pattern is, we can convey the flavor 
by examining our representative elliptic curve 

$$E_3 : y^2 = x^3 - 4x^2 + 16.$$

The other quantity that we look at is the following product: 

$$\Theta = T\{(1 - T)(1 - T^{11})\}^2 \cdot \{(1 - T^2)(1 - T^{22})\}^2 \cdot \{(1 - T^3)(1 - T^{33})\}^2 \cdot \{(1 - T^4)(1 - T^{44})\}^2 \cdots$$


This product is meant to continue indefinitely, but if we multiply out the first factors, 
we find that the beginning terms stabilize and don’t change when we multiply 
by additional factors. For example, if we multiply out all the factors up to 
$\{(1 - T^{23})(1 - T^{253})\}^2$, then we get 

$$\begin{aligned}
\Theta = {} & T - 2T^2 - T^3 + 2T^4 + T^5 + 2T^6 - 2T^7 - 2T^9 - 2T^{10} + T^{11} \\
& - 2T^{12} + 4T^{13} + 4T^{14} - T^{15} - 4T^{16} - 2T^{17} + 4T^{18} \\
& + 2T^{20} + 2T^{21} - 2T^{22} - T^{23} + \cdots,
\end{aligned}$$

and these first 23 terms won’t change if we multiply by more factors. 

At this point you are probably wondering what this product $\Theta$ could possibly 
have to do with the elliptic curve $E_3$. To answer your question, here again is a list 
of the $p$-defects for $E_3$ for all primes up to 23: 

$$a_2 = 0, \quad a_3 = -1, \quad a_5 = 1, \quad a_7 = -2, \quad a_{11} = 1,$$
$$a_{13} = 4, \quad a_{17} = -2, \quad a_{19} = 0, \quad a_{23} = -1.$$

Ignoring $a_2$, can you see a relation between these $a_p$’s and the product $\Theta$? When 
we write the product $\Theta$ as a sum, it appears that the coefficient of $T^p$ is equal to $a_p$. 
Amazingly, this pattern continues for all primes. 


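Theorem 45.2 (Modularity Theorem for $E_3$). Let $E_3$ be the elliptic curve 

$$E_3 : y^2 = x^3 - 4x^2 + 16,$$

and let $\Theta$ be the product 

$$\Theta = T\{(1 - T)(1 - T^{11})\}^2 \cdot \{(1 - T^2)(1 - T^{22})\}^2 \cdot \{(1 - T^3)(1 - T^{33})\}^2 \cdot \{(1 - T^4)(1 - T^{44})\}^2 \cdots.$$

Multiply out $\Theta$ and write it as a sum 

$$\Theta = c_1 T + c_2 T^2 + c_3 T^3 + c_4 T^4 + c_5 T^5 + \cdots.$$

Then for every prime $p \ge 3$, the $p$-defect of $E_3$ satisfies $a_p = c_p$. 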
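
The statement can be tested numerically for small primes. The sketch below multiplies out the product as a truncated power series and compares $c_p$ with a brute-force $a_p$; the function names are illustrative, and truncation is harmless because a factor $(1 - T^m)$ with $m$ beyond the cutoff cannot change the lower coefficients.

```python
def theta_coefficients(n_terms):
    """Coefficients c_1, ..., c_{n_terms} of
    T * prod_{k>=1} {(1 - T^k)(1 - T^(11k))}^2, as a dict n -> c_n."""
    # series[i] holds the coefficient of T^i in the product WITHOUT the leading T.
    series = [0] * n_terms
    series[0] = 1

    def multiply_by_one_minus(power):
        # In-place multiplication by (1 - T^power); descending i keeps old values intact.
        for i in range(n_terms - 1, power - 1, -1):
            series[i] -= series[i - power]

    for k in range(1, n_terms):
        for _ in range(2):  # each brace is squared
            multiply_by_one_minus(k)
            multiply_by_one_minus(11 * k)
    # The leading factor T shifts every exponent up by one.
    return {n: series[n - 1] for n in range(1, n_terms + 1)}


def defect(a, b, c, p):
    """p-defect a_p = p - N_p for y^2 = x^3 + a*x^2 + b*x + c modulo p."""
    square_roots = [0] * p
    for y in range(p):
        square_roots[y * y % p] += 1
    n_p = sum(square_roots[(x**3 + a * x**2 + b * x + c) % p] for x in range(p))
    return p - n_p


c = theta_coefficients(30)
for p in [3, 5, 7, 11, 13, 17, 19, 23, 29]:
    print(p, "c_p =", c[p], " a_p =", defect(-4, 0, 16, p))
```

For $p = 2$ the two numbers differ ($c_2 = -2$ while $a_2 = 0$), which is why the theorem starts at $p = 3$.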


In the 1950s, Yutaka Taniyama made a sweeping conjecture concerning modularity 
patterns, and during the 1960s Goro Shimura refined Taniyama’s conjecture 
to the assertion that every elliptic curve should exhibit a modularity pattern. André 
Weil then proved a Converse Theorem that helped the conjecture of Shimura and 
Taniyama to gain widespread acceptance. 


Conjecture 45.3 (Modularity Conjecture). (Shimura, Taniyama) Every elliptic 
curve $E$ is modular. That is, the $p$-defects of $E$ exhibit a modularity pattern. 


What does it mean to say that the $p$-defects of an elliptic curve $E$ “exhibit a 
modularity pattern”? It means that there is a series 

$$\Theta = c_1 T + c_2 T^2 + c_3 T^3 + c_4 T^4 + c_5 T^5 + \cdots$$

such that for (most) primes $p$, the coefficient $c_p$ equals the $p$-defect $a_p$ of $E$, and 
such that $\Theta$ has certain wonderful transformation properties that are unfortunately 
too complicated for us to describe precisely.¹ Despite this lack of precision, I hope 
that the $\Theta$ for $E_3$ helps to convey the flavor of what modularity means. 


Exercises 
45.1. In this exercise you will look for further patterns in the coefficients of the product $\Theta$ 
described in the Modularity Theorem for $E_3$. If we write $\Theta$ as a sum, 

$$\Theta = c_1 T + c_2 T^2 + c_3 T^3 + c_4 T^4 + c_5 T^5 + \cdots,$$

the Modularity Theorem says that for primes $p \ge 3$ the $p$th coefficient $c_p$ is equal to the 
$p$-defect $a_p$ of $E_3$. Use the following table, which lists the $c_n$ coefficients of $\Theta$ for all 
$n \le 100$, to formulate conjectures. 


 $n$   |  1 |  2 |   3 |  4 |  5 |  6 |  7 |  8 |  9 | 10 | 11 |  12 |  13 | 14 | 15 | 16 | 17 
 $c_n$ |  1 | -2 |  -1 |  2 |  1 |  2 | -2 |  0 | -2 | -2 |  1 |  -2 |   4 |  4 | -1 | -4 | -2 

 $n$   | 18 | 19 |  20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 |  29 |  30 | 31 | 32 | 33 | 34 
 $c_n$ |  4 |  0 |   2 |  2 | -2 | -1 |  0 | -4 | -8 |  5 | -4 |   0 |   2 |  7 |  8 | -1 |  4 

 $n$   | 35 | 36 |  37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 |  46 |  47 | 48 | 49 | 50 | 51 
 $c_n$ | -2 | -4 |   3 |  0 | -4 |  0 | -8 | -4 | -6 |  2 | -2 |   2 |   8 |  4 | -3 |  8 |  2 

 $n$   | 52 | 53 |  54 | 55 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |  63 |  64 | 65 | 66 | 67 | 68 
 $c_n$ |  8 | -6 | -10 |  1 |  0 |  0 |  0 |  5 | -2 | 12 | -14 |  4 |  -8 |  4 |  2 | -7 | -4 


(a) Find a relationship between $c_m$, $c_n$, and $c_{mn}$ when $\gcd(m, n) = 1$. 
(b) Find a relationship between $c_p$ and $c_{p^2}$ for primes $p$. To assist you, here are the values 
of $c_{p^2}$ for $p \le 37$. 

$$\begin{aligned}
c_{2^2} &= 2, & c_{3^2} &= -2, & c_{5^2} &= -4, & c_{7^2} &= -3, \\
c_{11^2} &= 1, & c_{13^2} &= 3, & c_{17^2} &= -13, & c_{19^2} &= -19, \\
c_{23^2} &= -22, & c_{29^2} &= -29, & c_{31^2} &= 18, & c_{37^2} &= -28.
\end{aligned}$$

[Hint. The prime $p = 11$ is a bad prime for $E_3$, so you may want to treat $c_{11^2}$ as 
experimental error and ignore it!] 


¹For those who have had some complex analysis, here is the main part of the modularity condition. 
We think of $\Theta$ as being a function of $T$, and we set $f(z) = \Theta(e^{2\pi i z})$. Then there is an integer 
$N \ge 1$ such that if $A, B, C, D$ are any integers satisfying $AD - BCN = 1$, then the function $f(z)$ 
satisfies 

$$f\left(\frac{Az + B}{CNz + D}\right) = (CNz + D)^2 f(z)$$

for all complex numbers $z = x + iy$ with $y > 0$. 

(c) Generalize (b) by finding a relationship between various $c_{p^k}$’s for primes $p$. To assist 
you, here are the values of $c_{p^k}$ for $p = 3$ and $5$ and $1 \le k \le 8$. 

$$\begin{aligned}
c_{3^1} &= -1, & c_{3^2} &= -2, & c_{3^3} &= 5, & c_{3^4} &= 1, \\
c_{3^5} &= -16, & c_{3^6} &= 13, & c_{3^7} &= 35, & c_{3^8} &= -74, \\
c_{5^1} &= 1, & c_{5^2} &= -4, & c_{5^3} &= -9, & c_{5^4} &= 11, \\
c_{5^5} &= 56, & c_{5^6} &= 1, & c_{5^7} &= -279, & c_{5^8} &= -284.
\end{aligned}$$


(d) Use the relationships you have discovered to compute the following $c_m$ values: 
(i) $c_{400}$ (ii) $c_{289}$ (iii) $c_{1521}$ (iv) $c_{16807}$. 


45.2. In this exercise we look at the modularity pattern for the elliptic curve 

$$E : y^2 = x^3 + 1.$$

The $p$-defects for $E$ are listed in Exercise 43.5. Consider the product 

$$\Theta = T(1 - T^k)^4 (1 - T^{2k})^4 (1 - T^{3k})^4 (1 - T^{4k})^4 \cdots.$$

(a) Multiply out the first few factors of $\Theta$, 

$$\Theta = c_1 T + c_2 T^2 + c_3 T^3 + c_4 T^4 + c_5 T^5 + c_6 T^6 + \cdots.$$

Try to guess what value of $k$ makes the $c_p$’s equal to the $a_p$’s of $E$. 

(b) Using your chosen value of $k$ from (a), find the values of $c_1, c_2, \ldots, c_{18}$. 

(c) If you’re using a computer, find the values of $c_1, c_2, \ldots, c_{100}$. How is the value of $c_{91}$ 
related to the values of $c_7$ and $c_{13}$? How is the value of $c_{49}$ related to the value of $c_7$? 
Make a conjecture. 


45.3. The product 

$$f(X) = (1 - X)(1 - X^2)(1 - X^3)(1 - X^4)(1 - X^5) \cdots$$

is useful for describing modularity patterns. For example, the modularity pattern for the 
elliptic curve $E_3$ is given by $\Theta = T \cdot f(T)^2 \cdot f(T^{11})^2$. Now consider the elliptic curve 

$$y^2 = x^3 - x^2 - 4x + 4.$$

It turns out that the modularity pattern for this curve looks like 

$$\Theta = T \cdot f(T^j) \cdot f(T^k) \cdot f(T^m) \cdot f(T^n)$$

for certain positive integers $j, k, m, n$. Accumulate some data and try to figure out the 
correct values for $j, k, m, n$. (You’ll probably need a computer to do this problem.) 


Chapter 46 


Elliptic Curves and Fermat’s Last Theorem 


Fermat’s Last Theorem says that if $n \ge 3$ then the Diophantine equation 

$$A^n + B^n = C^n$$

has no solutions in nonzero integers. We proved in Chapter 30 that there are no 
solutions when $n = 4$. We also observe that if $p \mid n$, say $n = pm$, and if 
$A^n + B^n = C^n$, then 

$$(A^m)^p + (B^m)^p = (C^m)^p.$$

So if Fermat’s equation has no solutions for prime exponents, then it won’t have 
solutions for nonprime exponents either. 

The history of Fermat’s Last Theorem was briefly discussed in Chapter 4. It is 
probably fair to say that most of the deep work on Fermat’s equation done prior to 
the 1980s was based on factorization techniques of one type or another. In 1986 
Gerhard Frey suggested a connection between Fermat’s Last Theorem and elliptic 
curves that he thought might give a new line of attack. 

Frey’s idea was to take a supposed solution $(A, B, C)$ to Fermat’s equation and 
look at the elliptic curve 

$$E_{A,B} : y^2 = x(x + A^p)(x - B^p).$$

This elliptic curve is now called the Frey curve in his honor. The discriminant of 
the Frey curve turns out to be 

$$\Delta(E_{A,B}) = A^{2p} B^{2p} (A^p + B^p)^2 = (ABC)^{2p},$$

a perfect $2p$th power. This would be, to say the least, a trifle unusual. In fact, it 
would be so unusual that Frey suggested such a curve could not exist at all. More 
precisely, he conjectured that $E_{A,B}$ would be so strange that its $p$-defects could not 
exhibit a modularity pattern. Frey’s conjecture was put into a more refined form by 
Jean-Pierre Serre, and in 1986 Ken Ribet proved that a Frey curve coming from a 
solution to Fermat’s equation would indeed violate the Modularity Conjecture. In 
other words, Ribet proved that if $A^p + B^p = C^p$ with $ABC \ne 0$, then the Frey 
curve $E_{A,B}$ is not modular. 
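
The discriminant of the Frey curve can be worked out from the general formula $\Delta(E) = -4a^3c + a^2b^2 - 4b^3 - 27c^2 + 18abc$ for a curve $y^2 = x^3 + ax^2 + bx + c$. The following derivation is a sketch of that computation, not taken from the text; it uses only the assumption $A^p + B^p = C^p$.

```latex
\begin{align*}
y^2 &= x(x + A^p)(x - B^p) = x^3 + (A^p - B^p)\,x^2 - A^p B^p\,x,
\intertext{so $a = A^p - B^p$, $b = -A^p B^p$, and $c = 0$. Every term containing $c$ vanishes, leaving}
\Delta &= a^2 b^2 - 4b^3 = b^2\,(a^2 - 4b) \\
       &= A^{2p} B^{2p} \bigl( (A^p - B^p)^2 + 4 A^p B^p \bigr) \\
       &= A^{2p} B^{2p} (A^p + B^p)^2 \\
       &= A^{2p} B^{2p} C^{2p} = (ABC)^{2p}.
\end{align*}
```

The last line is the only place where the supposed solution to Fermat's equation is used.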

Inspired by Ribet’s work, Andrew Wiles devoted the next six years to proving 
that every elliptic curve (or at least most of them) exhibits a modularity pattern. Ultimately, 
he was able to prove that every semistable elliptic curve exhibits a modularity pattern, 
and this is enough because the Frey curves turn out to be semistable.¹ We can 
now proceed as follows: 


Proof (Sketch) of Fermat’s Last Theorem 


Let $p \ge 3$ be a prime, and suppose that there is a solution $(A, B, C)$ to 
$A^p + B^p = C^p$ with $A, B, C$ nonzero integers and $\gcd(A, B, C) = 1$. 
Let $E_{A,B}$ be the Frey curve $y^2 = x(x + A^p)(x - B^p)$. 

Wiles’s Theorem tells us that $E_{A,B}$ is modular; that is, its $p$-defects $a_p$ 
follow a modularity pattern. 

Ribet’s Theorem tells us that $E_{A,B}$ is so strange that it cannot possibly 
be modular. 

The only way out of this seeming contradiction is the conclusion that 
the equation $A^p + B^p = C^p$ has no solutions in nonzero integers. □ 


It is here, at the successful resolution of this most famous problem in mathematics, 
that we end our voyage through the Seven Seas of Number Theory. I hope 
you have enjoyed the tour as much as I have enjoyed being your guide and that you 
have found much to admire and much to ponder in this most beautiful of subjects. 
Above all, I hope that you have gained a sense of mathematics as a living, growing 
enterprise, with many wonderful treasures already discovered, but with many others, 
even more wonderful, waiting just over the horizon for the person having the 
insight, the daring, and the perseverance to sail into the unknown. 


¹An elliptic curve is semistable if, for every bad prime $p \ge 3$, the $p$-defect $a_p$ is equal to $\pm 1$. 
There is also a more complicated condition if the prime $p = 2$ is bad, but luckily it turns out that the 
Frey curves can be transformed so that 2 becomes a good prime. 


Further Reading 


Here are some books to assist you in your continuing study of Number Theory. 


The Higher Arithmetic, H. Davenport, Cambridge University Press, Cambridge, 1952 (7th 
edition, 1999). 
A beautiful introduction to number theory, covering many of the same topics as this 
book, but written in a more rigorous style. Highly recommended. 


The following four books are standard introductions to number theory. They each include 
more material than we have been able to cover. The book of Ireland and Rosen uses 
advanced methods from abstract algebra. 


An Introduction to the Theory of Numbers, G.H. Hardy and E.M. Wright, Oxford University 
Press, London, 1938 (6th edition, 2008). 


A Classical Introduction to Modern Number Theory, K. Ireland and M. Rosen, Springer-Verlag, 
NY, 1982 (2nd edition, 1990). 

An Introduction to the Theory of Numbers, I. Niven, H. Zuckerman, and H. Montgomery, 
John Wiley & Sons, NY, 1960 (5th edition, 1991). 

A Course in Number Theory, H.E. Rose, Clarendon Press, Oxford, 1988 (2nd edition, 
1994). 


The remaining volumes in our list cover specific topics in more depth. 


The Little Book of Primes, P. Ribenboim, Springer-Verlag, NY, 1991. 
A delightful compendium of primes of all sizes and shapes. 


13 Lectures on Fermat’s Last Theorem, P. Ribenboim, Springer-Verlag, NY, 1979. 
Fermat’s Last Theorem through the centuries, up to, but not including, the breakthrough 
proof of Wiles. 


An Introduction to Mathematical Cryptography, J. Hoffstein, J. Pipher, and J.H. Silverman, 
Springer-Verlag, NY, 2008. 
Public key and private key cryptosystems explained, with accompanying background 
in algebra and number theory. 

Introduction to Analytic Number Theory, T. Apostol, Springer-Verlag, NY, 1976. 
Studying number theory via methods from calculus. 

Rational Points on Elliptic Curves, J.H. Silverman and J. Tate, Springer-Verlag, NY, 1992. 
Number theory and elliptic curves, including proofs of special cases of the theorems 
of Mordell, Hasse, and Siegel. 